Abstract

Most economic analyses presume that there are limited differences in the prior beliefs of
individuals, an assumption most often justified by the argument that sufficient common
experiences and observations will eliminate disagreements. We investigate this claim using a
simple model of Bayesian learning. Two individuals with different priors observe the same
infinite sequence of signals about some underlying parameter. Existing results in the literature
establish that when individuals are certain about the interpretation of signals, under very mild
conditions there will be asymptotic agreement: their assessments will eventually agree. In
contrast, we look at an environment in which individuals are uncertain about the interpretation of
signals, meaning that they have non-degenerate probability distributions over the conditional
distribution of signals given the underlying parameter. When priors on the parameter and the
conditional distribution of signals have full support, we prove the following results: (1)
Individuals will never agree, even after observing the same infinite sequence of signals. (2) Before
observing the signals, they believe with probability 1 that their posteriors about the underlying
parameter will fail to converge. (3) Observing the same sequence of signals may lead to a
divergence of opinion rather than the typically presumed convergence. We then characterize the
conditions for asymptotic agreement under "approximate certainty," i.e., as we look at the
limit where uncertainty about the interpretation of the signals disappears. When the family
of probability distributions of signals given the parameter has "rapidly-varying tails" (such
as the normal or the exponential distributions), approximate certainty restores asymptotic
agreement. However, when the family of probability distributions has "regularly-varying tails"
(such as the Pareto, the log-normal, and the t-distributions), asymptotic agreement does not
obtain even in the limit as the amount of uncertainty disappears.

Lack of common priors has important implications for economic behavior in a range of
circumstances. We illustrate how the type of learning outlined in this paper interacts with
economic behavior in various situations, including games of common interest,
coordination, asset trading and bargaining.

Keywords:  asymptotic  disagreement,  Bayesian  learning,  merging  of  opinions. 
1      Introduction

The  common  prior  assumption  is  one  of  the  cornerstones  of  modern  economic  analysis.  Most 
models  postulate  that  the  players  in  a  game  have  the  "same  model  of  the  world,"  or  more 
precisely,  that  they  have  a  common  prior  about  the  game  form  and  payoff  distributions — 
for example, they all agree that some payoff-relevant parameter vector $\theta$ is drawn from a
known distribution $G$, even though each may also have additional information about some
components of $\theta$. The typical justification for the common prior assumption comes from
learning; individuals, through their own experiences and the communication of others, will
have access to a history of events informative about the vector $\theta$, and this process will lead to
"agreement" among individuals about the distribution of the vector $\theta$. A strong version of this
view is expressed in Savage (1954, p. 48) as the statement that a Bayesian individual, who
does not assign zero probability to "the truth," will learn it eventually as long as the signals
are informative about the truth. A more sophisticated version of this conclusion also follows
from Blackwell and Dubins' (1962) theorem about the "merging of opinions."^1

Despite these powerful intuitions and theorems, disagreement is the rule rather than the
exception in practice. Just to mention a few instances, there is typically considerable
disagreement even among economists working on a certain topic. For example, economists routinely
disagree about the role of monetary policy, the impact of subsidies on investment or the
magnitude of the returns to schooling. Similarly, there are deep divides about religious beliefs
within populations with shared experiences, and finally, there was recently considerable
disagreement among experts with access to the same data about whether Iraq had weapons of
mass destruction. In none of these cases can the disagreements be traced to individuals having
access to different histories of observations. Rather, it is their interpretations that differ. In
particular,  it  seems  that  an  estimate  showing  that  subsidies  increase  investment  is  interpreted 
very  differently  by  two  economists  starting  with  different  priors;  for  example,  an  economist 
believing  that  subsidies  have  no  effect  on  investment  appears  more  likely  to  judge  the  data  or 
the  methods  leading  to  this  estimate  to  be  unreliable  and  thus  to  attach  less  importance  to  this 
evidence.  Similarly,  those  who  believed  in  the  existence  of  weapons  of  mass  destruction  in  Iraq 


^1 Blackwell and Dubins' (1962) theorem shows that if two probability measures are absolutely continuous with
respect to each other (meaning that they assign positive probability to the same events), then as the number of
observations goes to infinity, their predictions about future frequencies will agree. This is also related to Doob's
(1948) consistency theorem for Bayesian posteriors, which we discuss and use below.


presumably  interpreted  the  evidence  from  inspectors  and  journalists  indicating  the  opposite  as 
biased  rather  than  informative. 

In  this  paper,  we  show  that  this  type  of  behavior  will  be  the  outcome  of  learning  by 
Bayesian  individuals  with  different  priors  when  they  are  uncertain  about  the  informativeness 
of signals. In particular, we consider the following simple environment: one or two individuals
with given priors observe a sequence of signals, $\{s_t\}_{t=0}^{n}$, and form their posteriors about some
underlying state variable (or parameter) $\theta$. The only non-standard feature of the environment
is that these individuals may be uncertain about the distribution of signals conditional on
the underlying state. In the simplest case where the state and the signal are binary, e.g.,
$\theta \in \{A, B\}$ and $s_t \in \{a, b\}$, this implies that $\Pr(s_t = \theta \mid \theta) = p_\theta$ is not a known number,
but individuals also have a prior over $p_\theta$, say given by $F_\theta$. We refer to this distribution $F_\theta$ as
individuals' subjective probability distribution and to its density $f_\theta$ as subjective (probability)
density. This distribution, which can differ among individuals, is a natural measure of their
uncertainty about the informativeness of signals. When subjective probability distributions
are non-degenerate, individuals will have some latitude in interpreting the sequence of signals
they observe.

We identify conditions under which Bayesian updating leads to asymptotic learning
(individuals learning, or believing that they will be learning, the true value of $\theta$ with probability 1
after observing infinitely many signals) and asymptotic agreement (convergence between their
assessments of the value of $\theta$). We first provide some generalizations of existing results on
asymptotic learning and agreement. First, we show that learning under certainty leads to
asymptotic learning and agreement. In particular, when each individual $i$ is sure that $p_\theta = p^i$ for
some known number $p^i > 1/2$ (with possibly $p^1 \neq p^2$), then asymptotic learning and agreement
are guaranteed. Second, we establish the stronger result that when both individuals attach
probability 1 to the event that $p_\theta > 1/2$ for $\theta \in \{A, B\}$, then there will again be asymptotic
learning and agreement.

These positive results do not hold, however, when there is positive probability that $p_\theta$ might
be less than 1/2. In particular, when $F_\theta$ has full support for each $\theta$, we show that:

1. There will not be asymptotic learning. Instead each individual's posterior of $\theta$ continues
to be a function of his prior.

2. There will not be asymptotic agreement; two individuals with different priors observing
the same sequence of signals will reach different posterior beliefs even after observing
infinitely many signals. Moreover, individuals attach ex ante probability 1 to the event that
they will disagree after observing the sequence of signals.

3. Two individuals may disagree more after observing a common sequence of signals than
they did previously. In fact, for any model of learning under uncertainty that satisfies
the  full  support  assumption,  there  exists  an  open  set  of  pairs  of  priors  such  that  the 
disagreement  between  the  two  individuals  will  necessarily  grow  starting  from  these  priors. 

While it may appear plausible that the individuals should not attach zero probability to
the event that $p_\theta < 1/2$, it is also reasonable to expect that the probability of such events
should  be  relatively  small.  This  raises  the  question  of  whether  the  results  regarding  the  lack 
of  asymptotic  learning  and  agreement  under  uncertainty  survive  when  there  is  a  small  amount 
of  uncertainty.  Put  differently,  we  would  like  to  understand  whether  the  asymptotic  learning 
and  agreement  results  under  certainty  are  robust  to  a  small  amount  of  uncertainty. 

We investigate this issue by studying learning under "approximate certainty," i.e., by
considering a family of subjective density functions $\{f_m\}$ that become more and more concentrated
around a single point, thus converging to full certainty. It is straightforward to see that as
each individual becomes more and more certain about the interpretation of the signals,
asymptotic learning obtains. Interestingly, however, the conditions for asymptotic agreement are
much more demanding than those for asymptotic learning. Consequently, even though each
individual expects to learn the payoff-relevant parameters, asymptotic agreement may fail to
obtain. This implies that asymptotic agreement under certainty may be a discontinuous limit
point of a general model of learning under uncertainty. We show that whether or not this
is the case depends on the tail properties of the family of subjective density functions $\{f_m\}$.
When this family has regularly-varying tails (such as the Pareto or the log-normal
distributions), even under approximate certainty there will be asymptotic disagreement. When $\{f_m\}$
has rapidly-varying tails (such as the normal distribution), there will be asymptotic agreement
under approximate certainty.

Intuitively, approximate certainty is sufficient to make each individual believe that they
will learn the payoff-relevant parameter, but they may still believe that the other individual
will fail to learn. Whether or not they believe this depends on how an individual reacts when
a frequency of signals different from the one he expects with "almost certainty" occurs. If
this event prevents the individual from learning, then there will be asymptotic disagreement
under approximate certainty. This is because under approximate certainty, each individual
trusts his own model of the world and thus expects the limiting frequencies to be consistent
with his model. When the other individual's model of the world differs, he expects the other
individual to be surprised by the limiting frequencies of the signals. Then whether or not
asymptotic agreement will obtain depends on whether this surprise is sufficient to prevent the
other individual from learning, which in turn depends on the tail properties of the family of
subjective density functions $\{f_m\}$.

Lack of asymptotic agreement has important implications for a range of economic
situations. We illustrate some of these by considering a number of simple environments where two
individuals observe the same sequence of signals before or while playing a game. In particular,
we discuss the implications of learning in uncertain environments for games of coordination,
games of common interest, bargaining, games of communication and asset trading. We show
how, when they are learning under uncertainty, individuals will play these games differently
than they would in environments with common priors, and also differently than in
environments without common priors but where learning takes place under certainty. For example,
we establish that, contrary to standard results, individuals may wish to play games of common
interest before receiving more information about payoffs. We also show how the possibility of
observing the same sequence of signals may lead individuals to trade only after they observe
the public information. This result contrasts with both standard no-trade theorems (e.g.,
Milgrom and Stokey, 1982) and existing results on asset trading without common priors, which
assume learning under certainty (Harrison and Kreps, 1978, and Morris, 1996). Finally, we
provide a simple example illustrating a potential reason why individuals may be uncertain
about the informativeness of signals: the strategic behavior of other agents trying to manipulate
their beliefs.

Our  results  cast  doubt  on  the  idea  that  the  common  prior  assumption  may  be  justified  by 
learning.  In  many  environments,  even  when  there  is  little  uncertainty  so  that  each  individual 
believes  that  he  will  learn  the  true  state,  learning  does  not  necessarily  imply  agreement  about 
the  relevant  parameters.  Consequently,  the  strategic  outcome  may  be  significantly  different 
from that of the common-prior environment.^2 Whether this assumption is warranted therefore


^2 For previous arguments on whether game-theoretic models should be formulated with all individuals having
a common prior, see, for example, Aumann (1986, 1998) and Gul (1998). Gul (1998), for example, questions


depends  on  the  specific  setting  and  what  type  of  information  individuals  are  trying  to  glean 
from  the  data. 

Relating  our  results  to  the  famous  Blackwell-Dubins  (1962)  theorem  may  help  clarify  their 
essence.  As  briefly  mentioned  in  Footnote  1,  this  theorem  shows  that  when  two  agents  agree 
on  zero-probability  events  (i.e.,  their  priors  are  absolutely  continuous  with  respect  to  each 
other),  asymptotically,  they  will  make  the  same  predictions  about  future  frequencies  of  signals. 
Our  results  do  not  contradict  this  theorem,  since  we  impose  absolute  continuity  throughout. 
Instead,  our  results  rely  on  the  fact  that  agreeing  about  future  frequencies  is  not  the  same 
as agreeing about the underlying state (or the underlying payoff-relevant parameters).^3 Put
differently, under uncertainty, there is an "identification problem" making it impossible for
individuals to infer the underlying state from limiting frequencies and this leads to different
interpretations of the same signal sequence by individuals with different priors. In most
economic situations, what is important is not future frequencies of signals but some payoff-relevant
parameter.  For  example,  what  was  essential  for  the  debate  on  the  weapons  of  mass  destruction 
was  not  the  frequency  of  news  about  such  weapons  but  whether  or  not  they  existed.  What 
is  relevant  for  economists  trying  to  evaluate  a  policy  is  not  the  frequency  of  estimates  on  the 
effect  of  similar  policies  from  other  researchers,  but  the  impact  of  this  specific  policy  when 
(and  if)  implemented.  Similarly,  what  may  be  relevant  in  trading  assets  is  not  the  frequency 
of  information  about  the  dividend  process,  but  the  actual  dividend  that  the  asset  will  pay. 
Thus, many situations in which individuals need to learn about a parameter or state that
will determine their ultimate payoff as a function of their action fall within the realm of the
analysis here.

In  this  respect,  our  work  differs  from  papers,  such  as  Freedman  (1963,  1965)  and  Miller 
and  Sanchirico  (1999),  that  question  the  applicability  of  the  absolute  continuity  assumption 
in  the  Blackwell-Dubins  theorem  in  statistical  and  economic  settings  (see  also  Diaconis  and 
Freedman,  1986,  Stinchcombe,  2005).  Similarly,  a  number  of  important  theorems  in  statistics, 
for example, Berk (1966), show that under certain conditions, limiting posteriors will have
their  support  on  the  set  of  all  identifiable  values  (though  they  may  fail  to  converge  to  a 
limiting  distribution).  Our  results  are  different  from  those  of  Berk  both  because  in  our  model 


whether the common prior assumption makes sense when there is no ex ante stage.

^3 In this respect, our paper is also related to Kurz (1994, 1996), who considers a situation in which agents
agree about long-run frequencies, but their beliefs fail to merge because of the non-stationarity of the world.


individuals  always  place  positive  probability  on  the  truth  and  also  because  we  provide  a  tight 
characterization  of  the  conditions  for  lack  of  asymptotic  learning  and  agreement. 

Our  paper  is  also  closely  related  to  recent  independent  work  by  Cripps,  Ely,  Mailath  and 
Samuelson  (2006),  who  study  the  conditions  under  which  there  will  be  "common  learning"  by 
two  agents  observing  correlated  signals.  They  show  how  individual  learning  may  not  lead  to 
"common  knowledge"  when  the  signal  space  is  infinite.  Cripps,  Ely,  Mailath  and  Samuelson's 
analysis  focuses  on  the  case  in  which  the  agents  start  with  common  priors  and  learn  under 
certainty  (though  they  note  how  their  results  can  be  extended  to  the  case  of  non-common 
priors). Consequently, our emphasis on learning under uncertainty and the results on learning
under conditions of approximate certainty are not shared by their paper. Nevertheless, there
is a close connection between our result that under approximate certainty each agent expects
that he will learn the payoff-relevant parameters but that he will disagree with the other agent
and Cripps, Ely, Mailath and Samuelson's finding of lack of common learning with
infinite-dimensional signal spaces.

The  rest  of  the  paper  is  organized  as  follows.  Section  2  provides  all  our  main  results  in  the 
context  of  a  two-state  two-signal  setup.  Section  3  provides  generalizations  of  these  results  to 
an environment with $K$ states and $L \geq K$ signals. Section 4 considers a variety of applications
of  our  results,  and  Section  5  concludes. 

2     The  Two-State  Model 
2.1     Environment 

We  start  with  a  two-state  model  with  binary  signals.  This  model  is  sufficient  to  establish  all 
our main results in the simplest possible setting. These results are generalized to an arbitrary
number of states and signal values in Section 3.

There are two individuals, denoted by $i = 1$ and $i = 2$, who observe a sequence of signals
$\{s_t\}_{t=0}^{n}$ where $s_t \in \{a, b\}$. The underlying state is $\theta \in \{A, B\}$, and agent $i$ assigns ex ante
probability $\pi^i \in (0, 1)$ to $\theta = A$. The individuals believe that, given $\theta$, the signals are exchangeable,
i.e., they are independently and identically distributed with an unknown distribution.^4 That


^4 See, for example, Billingsley (1995). If there were only one state, then our model would be identical to De
Finetti's canonical model (see, for example, Savage, 1954). In the context of this model, De Finetti's theorem
provides a Bayesian foundation for classical probability theory by showing that exchangeability (i.e., invariance
under permutations of the order of signals) is equivalent to having an independent identical unknown
distribution and implies that posteriors converge to long-run frequencies. De Finetti's decomposition of probability


is, the probability of $s_t = a$ given $\theta = A$ is an unknown number $p_A$; likewise, the probability
of $s_t = b$ given $\theta = B$ is an unknown number $p_B$, as shown in the following table:

               $\theta = A$      $\theta = B$
$s_t = a$      $p_A$             $1 - p_B$
$s_t = b$      $1 - p_A$         $p_B$
Our main departure from the standard models is that we allow the individuals to be
uncertain about $p_A$ and $p_B$. We denote the cumulative distribution function of $p_\theta$ according
to individual $i$ (i.e., his subjective probability distribution) by $F_\theta^i$. In the standard models, $F_\theta^i$
is degenerate and puts probability 1 at some $p_\theta^i$. In contrast, for most of the analysis, we will
impose the following assumption:

Assumption 1 For each $i$ and $\theta$, $F_\theta^i$ has a continuous, non-zero and finite density $f_\theta^i$ over
$[0, 1]$.

The assumption implies that $F_\theta^i$ has full support over $[0, 1]$. It is worth noting that while
this assumption allows $F_\theta^1(p)$ and $F_\theta^2(p)$ to differ, for many of our results it is not important
whether or not this is so (i.e., whether or not the two individuals have a common prior about
the distribution of $p_\theta$). Moreover, as discussed in Remark 2, Assumption 1 is stronger than
necessary for our results, but simplifies the exposition.

In addition, throughout we assume that $\pi^1$, $\pi^2$, $F_\theta^1$ and $F_\theta^2$ are known to both individuals.^5
We consider infinite sequences $s \equiv \{s_t\}_{t=1}^{\infty}$ of signals and write $S$ for the set of all such
sequences. The posterior belief of individual $i$ about $\theta$ after observing the first $n$ signals $\{s_t\}_{t=1}^{n}$
is

$$\phi_n^i(s) \equiv \Pr^i\left(\theta = A \mid \{s_t\}_{t=1}^{n}\right),$$

where $\Pr^i(\theta = A \mid \{s_t\}_{t=1}^{n})$ denotes the posterior probability that $\theta = A$ given a sequence of
signals $\{s_t\}_{t=1}^{n}$ under prior $\pi^i$ and subjective probability distribution $F_\theta^i$ (see footnote 7 for a
formal definition).

Throughout, without loss of generality, we suppose that in reality $\theta = A$. The two questions
of interest for us are:


distributions  is  extended  by  Jackson,  Kalai  and  Smorodinsky  (1999)  to  cover  cases  without  exchangeability. 

^5 The assumption that player 1 knows the prior and probability assessment of player 2 regarding the
distribution of signals given the state is used in the "asymptotic agreement" results and in applications. Since our
purpose is to understand whether learning justifies the common prior assumption, we assume that agents do
not change their views because the beliefs of others differ from theirs.


1. Asymptotic learning: whether $\Pr^i\left(\lim_{n\to\infty} \phi_n^i(s) = 1 \mid \theta = A\right) = 1$ for $i = 1, 2$.

2. Asymptotic agreement: whether $\Pr^i\left(\lim_{n\to\infty} |\phi_n^1(s) - \phi_n^2(s)| = 0\right) = 1$ for $i = 1, 2$.

Notice that both asymptotic learning and agreement are defined in terms of the ex ante
probability assessments of the two individuals. Therefore, asymptotic learning implies that an
individual believes that he or she will ultimately learn the truth, while asymptotic agreement
implies that both individuals believe that their assessments will eventually converge.^6

2.2     Asymptotic Learning and Disagreement

The  following  theorem  gives  the  well-known  result,  which  applies  when  Assumption  1  does  not 
hold.  A  version  of  this  result  is  stated  in  Savage  (1954)  and  also  follows  from  Blackwell  and 
Dubins'  (1962)  more  general  theorem  applied  to  this  case.  Since  the  proof  of  this  theorem 
uses  different  arguments  than  those  presented  below  and  is  tangential  to  our  focus  here,  it  is 
relegated  to  the  Appendix. 

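Theorem 1 Assume that for some $p^1, p^2 \in (1/2, 1]$, each $F_\theta^i$ puts probability 1 on $p^i$, i.e.,
$F_\theta^i(p^i) = 1$ and $F_\theta^i(p) = 0$ for each $p < p^i$. Then, for each $i = 1, 2$,

1. $\Pr^i\left(\lim_{n\to\infty} \phi_n^i(s) = 1 \mid \theta = A\right) = 1$.

2. $\Pr^i\left(\lim_{n\to\infty} |\phi_n^1(s) - \phi_n^2(s)| = 0\right) = 1$.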

Theorem 1 is a slightly generalized version of the standard theorem where the individual
will learn the truth with experience (almost surely as $n \to \infty$) and two individuals observing
the same sequence will necessarily agree. The generalization arises from the fact that learning
and agreement take place even though $p^1$ may differ from $p^2$ (while Savage, 1954, assumes
that $p^1 = p^2$). The intuition of this theorem is useful for understanding the results that will
follow. The theorem states that even if the two individuals have different expectations about
the probability of $s_t = a$ conditional on $\theta = A$, the fact that $p^i > 1/2$ and that they hold
these beliefs with certainty is sufficient for asymptotic learning and agreement. For example,


^6 We formulate asymptotic learning in terms of each individual's initial probability measure so as not to take
a position on what the "objective" or "true" probability measure is.

In terms of asymptotic agreement, we will see that $\Pr^i\left(\lim_{n\to\infty} |\phi_n^1(s) - \phi_n^2(s)| = 0\right) = 1$ also implies
$\lim_{n\to\infty} |\phi_n^1(s) - \phi_n^2(s)| = 0$ for almost all sample paths, thus individual beliefs that there will be
asymptotic agreement coincide with asymptotic agreement (and vice versa).


consider an individual who expects a frequency of a signals $p^i > 1/2$ when the underlying state
is $\theta = A$. First, to see why asymptotic learning applies, it is sufficient to observe that this
individual is sure that he will be confronted either with a limiting frequency of a signals equal
to $p^i$, in which case he will conclude that $\theta = A$, or he will observe a limiting frequency of
$1 - p^i$, and he will conclude that $\theta = B$. Therefore, this individual believes that he will learn
the true state with probability 1. Next, to see why asymptotic agreement obtains, suppose
that this individual is confronted with a frequency $\rho > p^i$ of a signals. Since he believes with
certainty that the frequency of signals should be $p^i$ when the state is $\theta = A$ and $1 - p^i$ when
the state is $\theta = B$, he will interpret the frequency $\rho$ as resulting from sampling variation.
Given that $\rho > p^i$, this sampling variation is much more likely when the state is $\theta = A$
and therefore, he will attach probability 1 to the event that $\theta = A$. Asymptotic agreement
then follows from the observation that individual $i$ believes that individual $j$ will observe a
frequency of a signals $p^i$ when the state is $\theta = A$ and expects that he will conclude from this
that $\theta = A$ even though $p^i \neq p^j$ (as long as $p^i$ and $p^j$ are both greater than 1/2 as assumed in
the theorem).
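
The intuition above can be sketched numerically. The snippet below is a minimal illustration and not part of the original analysis: the function name `posterior_certain` and the sample numbers are hypothetical. It assumes that individual $i$ is certain that $p_A = p_B = p^i$ with $1/2 < p^i < 1$, so that after $k$ a-signals in $n$ observations the posterior log-odds of $\theta = A$ equal the prior log-odds plus $(2k - n)\log\left(p^i/(1 - p^i)\right)$.

```python
import math

def posterior_certain(prior, k, n, p):
    """Posterior Pr(theta = A) after k 'a' signals out of n observations,
    for an individual who is certain that
    Pr(s = a | A) = Pr(s = b | B) = p, with 1/2 < p < 1 (assumed setup)."""
    # Likelihood ratio of A versus B is (p / (1 - p)) ** (2k - n).
    log_odds = math.log(prior / (1 - prior)) + (2 * k - n) * math.log(p / (1 - p))
    # Numerically stable logistic transform of the log-odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)

if __name__ == "__main__":
    n, k = 1000, 700  # illustrative sample: frequency of 'a' signals is 0.7
    # Two individuals with different priors and different (certain) p^i.
    for prior, p in [(0.2, 0.6), (0.7, 0.9)]:
        print(prior, p, posterior_certain(prior, k, n, p))
```

In this sketch both individuals attribute the gap between the observed frequency and their own $p^i$ to sampling variation, and both posteriors move toward 1 whenever a-signals are in the majority, regardless of the prior or of which $p^i > 1/2$ each holds.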

We  next  generalize  Theorem  1  to  the  case  where  the  individuals  are  not  necessarily  certain 
about  the  signal  distribution  but  their  subjective  distributions  do  not  satisfy  the  full  support 
feature of Assumption 1.

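Theorem 2 Assume that each $F_\theta^i$ has a density $f_\theta^i$ and satisfies $F_\theta^i(1/2) = 0$. Then, for each
$i = 1, 2$,

1. $\Pr^i\left(\lim_{n\to\infty} \phi_n^i(s) = 1 \mid \theta = A\right) = 1$.

2. $\Pr^i\left(\lim_{n\to\infty} |\phi_n^1(s) - \phi_n^2(s)| = 0\right) = 1$.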

This  theorem  will  be  proved  together  with  Theorem  3.  It  is  evident  that  the  assumption 
$F_\theta^i(1/2) = 0$ implies that $p_\theta > 1/2$, contradicting the full support feature in Assumption 1.
The intuition for this result is similar to that of Theorem 1: when both individuals attach
probability 1 to the event that $p_\theta > 1/2$, they will believe that the majority of the signals in
the limiting distribution will be $s_t = a$ when $\theta = A$. Thus, each believes that both he and
the other individual will learn the underlying state with probability 1 (even though they may
both be uncertain about the exact distribution of signals conditional on the underlying state).
This theorem shows that results on asymptotic learning and agreement are substantially more
general than Savage's original theorem. Nevertheless, the result relies on the feature that
$F_\theta^i(1/2) = 0$ for each $i = 1, 2$ and each $\theta$. This implies that both individuals attach zero
probability to a range of possible models of the world, i.e., they are certain that $p_\theta$ cannot
be less than 1/2. It may instead be more reasonable to presume that, under uncertainty,
each individual may attach positive (though perhaps small) probability to all values of $p_\theta$ as
encapsulated by Assumption 1. We next impose this assumption and show that under the more
general circumstances where $F_\theta^i$ has full support, there will be neither asymptotic learning nor
asymptotic agreement.

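Theorem 3 Suppose Assumption 1 holds for $i = 1, 2$. Then,

1. $\Pr^i\left(\lim_{n\to\infty} \phi_n^i(s) \neq 1 \mid \theta = A\right) = 1$ for $i = 1, 2$;

2. $\Pr^i\left(\lim_{n\to\infty} |\phi_n^1(s) - \phi_n^2(s)| \neq 0\right) = 1$ whenever $\pi^1 \neq \pi^2$ and $F_\theta^1 = F_\theta^2$ for each
$\theta \in \{A, B\}$.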

This theorem contrasts with Theorems 1 and 2 and implies that the individual in question will fail to learn the true state with probability 1. The second part of the theorem states that if the individuals' prior beliefs about the state differ (but they interpret the signals in the same way), then their posteriors will eventually disagree, and moreover, they will both attach probability 1 to the event that their beliefs will eventually diverge. Put differently, this implies that there is "agreement to eventually disagree" between the two individuals, in the sense that they both believe ex ante that after observing the signals they will fail to agree. This feature will play an important role in the applications in Section 4 below.

Remark 1 The assumption that $F^1_\theta = F^2_\theta$ in this theorem is adopted for simplicity. Even in the absence of this condition, there will typically be no asymptotic agreement. Theorem 6 in the next section generalizes this theorem to a situation with multiple states and multiple signals and also dispenses with the assumption that $F^1_\theta = F^2_\theta$. It establishes that the set of priors and subjective probability distributions that leads to asymptotic agreement is of "measure zero".

Remark 2 Assumption 1 is considerably stronger than necessary for Theorem 3 and is adopted only for simplicity. It can be verified that for lack of asymptotic learning it is sufficient (but not necessary) that the measures generated by the distribution functions $F^i_A(p)$ and $F^i_B(1 - p)$ be absolutely continuous with respect to each other. Similarly, for lack of asymptotic agreement, it is sufficient (but not necessary) that the measures generated by $F^1_A(p)$, $F^1_B(1 - p)$, $F^2_A(p)$ and $F^2_B(1 - p)$ be absolutely continuous with respect to each other. For example, if both individuals believe that $p_A$ is either 0.3 or 0.7 (with the latter receiving greater probability) and that $p_B$ is also either 0.3 or 0.7 (with the former receiving greater probability), then there will be neither asymptotic learning nor asymptotic agreement. Throughout we use Assumption 1 both because it simplifies the notation and because it is a natural assumption when we turn to the analysis of asymptotic agreement under approximate certainty below.
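
The two-point example in Remark 2 can be checked numerically. The sketch below is only an illustration under stated assumptions: the weights `q_A` (the subjective probability that $p_A = 0.7$) and `q_B` (the subjective probability that $p_B = 0.3$) and the function names are hypothetical choices, not taken from the text. It computes the posterior that $\theta = A$ after observing $r$ signals equal to $a$ out of $n$, working in logs for numerical stability.

```python
import math

def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))

def posterior_A(prior, q_A, q_B, r, n):
    """Posterior Pr(theta = A) after r signals 'a' out of n signals.

    q_A: assumed subjective Pr(p_A = 0.7); otherwise p_A = 0.3.
    q_B: assumed subjective Pr(p_B = 0.3); otherwise p_B = 0.7.
    """
    def log_lik(x):
        # log-likelihood of the counts when each signal equals 'a' with chance x
        return r * math.log(x) + (n - r) * math.log(1.0 - x)

    # In state A the chance of 'a' is p_A.
    log_A = log_sum_exp(math.log(q_A) + log_lik(0.7),
                        math.log(1.0 - q_A) + log_lik(0.3))
    # In state B the chance of 'a' is 1 - p_B.
    log_B = log_sum_exp(math.log(q_B) + log_lik(0.7),
                        math.log(1.0 - q_B) + log_lik(0.3))
    log_odds_B = math.log((1.0 - prior) / prior) + log_B - log_A
    return 1.0 / (1.0 + math.exp(log_odds_B))

# A long sample with frequency 0.7 of 'a' leaves the posterior strictly inside (0, 1).
print(posterior_A(prior=0.5, q_A=0.8, q_B=0.6, r=700, n=1000))
```

With these illustrative weights, a frequency of 0.7 is consistent with both states, so the posterior settles at a value determined by the prior and the ratio `q_B / q_A` rather than at 0 or 1.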

Towards proving the above theorems, we now introduce some notation, which will be used throughout the paper. Recall that the sequence of signals, $s$, is generated by an exchangeable process, so that the order of the signals does not matter for the posterior. Let

$$r_n(s) \equiv \#\{t \le n \mid s_t = a\}$$

be the number of times $s_t = a$ out of first $n$ signals.^7 By the strong law of large numbers, $r_n(s)/n$ converges to some $\rho(s) \in [0,1]$ almost surely according to both individuals. Defining the set

$$\bar{S} \equiv \{s \in S : \lim_{n\to\infty} r_n(s)/n \text{ exists}\}, \tag{1}$$

this observation implies that $\Pr^i(s \in \bar{S}) = 1$ for $i = 1,2$. We will often state our results for all sample paths $s$ in $\bar{S}$, which equivalently implies that these statements are true almost surely or with probability 1. Now, a straightforward application of the Bayes rule gives

$$\phi^i_n(s) = \frac{1}{1 + \frac{1-\pi^i}{\pi^i}\,\frac{\Pr^i(r_n \mid \theta = B)}{\Pr^i(r_n \mid \theta = A)}}, \tag{2}$$

where $\Pr^i(r_n \mid \theta)$ is the probability of observing the signal $s_t = a$ exactly $r_n$ times out of $n$ signals with respect to the distribution $F^i_\theta$. The next lemma provides a very useful formula for $\phi^i_\infty(s) \equiv \lim_{n\to\infty} \phi^i_n(s)$ for all sample paths $s$ in $\bar{S}$.


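^7 Given the definition of $r_n(s)$, the probability distribution $\Pr^i$ on $\{A, B\} \times S$ is
$$\Pr{}^i\left(E^{A,s,n}\right) \equiv \pi^i \int_0^1 p^{r_n(s)} (1-p)^{n - r_n(s)} f^i_A(p)\,dp, \quad\text{and}$$
$$\Pr{}^i\left(E^{B,s,n}\right) \equiv (1 - \pi^i) \int_0^1 (1-p)^{r_n(s)} p^{n - r_n(s)} f^i_B(p)\,dp$$
at each event $E^{\theta,s,n} = \{(\theta, s') \mid s'_t = s_t \text{ for each } t \le n\}$, where $s \equiv \{s_t\}_{t=1}^\infty$ and $s' \equiv \{s'_t\}_{t=1}^\infty$.

Lemma 1 Suppose Assumption 1 holds. Then for all $s \in \bar{S}$,

$$\phi^i_\infty(\rho(s)) \equiv \lim_{n\to\infty} \phi^i_n(s) = \frac{1}{1 + \frac{1-\pi^i}{\pi^i} R^i(\rho(s))}, \tag{3}$$

where $\rho(s) = \lim_{n\to\infty} r_n(s)/n$, and $\forall \rho \in [0,1]$,

$$R^i(\rho) \equiv \frac{f^i_B(1-\rho)}{f^i_A(\rho)}. \tag{4}$$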


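Proof. Write

$$\frac{\Pr^i(r_n \mid \theta = B)}{\Pr^i(r_n \mid \theta = A)} = \frac{\int_0^1 p^{r_n}(1-p)^{n-r_n} f^i_B(1-p)\,dp}{\int_0^1 p^{r_n}(1-p)^{n-r_n} f^i_A(p)\,dp} = \frac{\dfrac{\int_0^1 p^{r_n}(1-p)^{n-r_n} f^i_B(1-p)\,dp}{\int_0^1 p^{r_n}(1-p)^{n-r_n}\,dp}}{\dfrac{\int_0^1 p^{r_n}(1-p)^{n-r_n} f^i_A(p)\,dp}{\int_0^1 p^{r_n}(1-p)^{n-r_n}\,dp}} = \frac{E^\lambda[f^i_B(1-p) \mid r_n]}{E^\lambda[f^i_A(p) \mid r_n]}.$$

Here, the first equality is obtained by dividing the numerator and the denominator by the same term. The resulting expression on the numerator is the conditional expectation of $f^i_B(1-p)$ given $r_n$ under the flat (Lebesgue) prior on $p$ and the Bernoulli distribution on $\{s_t\}_{t=0}^n$. Denoting this by $E^\lambda[f^i_B(1-p) \mid r_n]$, and the denominator, which is similarly defined as the conditional expectation of $f^i_A(p)$, by $E^\lambda[f^i_A(p) \mid r_n]$, we obtain the last equality. By Doob's consistency theorem for Bayesian posterior expectation of the parameter, as $r_n/n \to \rho$, we have that $E^\lambda[f^i_B(1-p) \mid r_n] \to f^i_B(1-\rho)$ and $E^\lambda[f^i_A(p) \mid r_n] \to f^i_A(\rho)$ (see, e.g., Doob, 1949, Ghosh and Ramamoorthi, 2003, Theorem 1.3.2). This establishes

$$\frac{\Pr^i(r_n \mid \theta = B)}{\Pr^i(r_n \mid \theta = A)} \to R^i(\rho),$$

as defined in (4). Equation (3) then follows from (2). ■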

In equation (4), $R^i(\rho)$ is the asymptotic likelihood ratio of observing frequency $\rho$ of $a$ when the true state is $B$ versus when it is $A$. Lemma 1 states that, asymptotically, individual $i$ uses this likelihood ratio and Bayes rule to compute his posterior beliefs about $\theta$.
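
As a rough numerical check of Lemma 1, the sketch below compares the exact finite-sample posterior with the limiting formula built from the ratio of densities. It assumes, purely for illustration, that the subjective densities of $p_A$ and $p_B$ are Beta densities (a choice not made in the text; Beta densities with shape parameters above 1 vanish at the endpoints, so the comparison is meaningful only at interior frequencies). Under that assumption the integrals over $p$ have closed forms in terms of the Beta function. All function names are hypothetical.

```python
import math

def log_beta(a, b):
    """log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_pdf(x, a, b):
    """Beta(a, b) density at an interior point x in (0, 1)."""
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_beta(a, b))

def posterior_finite(prior, r, n, a_A, b_A, a_B, b_B):
    """Exact posterior Pr(theta = A) after r signals 'a' out of n.

    p_A ~ Beta(a_A, b_A) and p_B ~ Beta(a_B, b_B) are illustrative assumptions.
    In state A the chance of 'a' is p_A; in state B it is 1 - p_B.
    """
    log_like_A = log_beta(a_A + r, b_A + n - r) - log_beta(a_A, b_A)
    log_like_B = log_beta(a_B + n - r, b_B + r) - log_beta(a_B, b_B)
    ratio = math.exp(log_like_B - log_like_A)
    return 1.0 / (1.0 + (1.0 - prior) / prior * ratio)

def posterior_limit(prior, rho, a_A, b_A, a_B, b_B):
    """Limiting posterior: 1 / (1 + (1 - prior) / prior * f_B(1 - rho) / f_A(rho))."""
    R = beta_pdf(1.0 - rho, a_B, b_B) / beta_pdf(rho, a_A, b_A)
    return 1.0 / (1.0 + (1.0 - prior) / prior * R)

finite = posterior_finite(0.5, r=70_000, n=100_000, a_A=4, b_A=2, a_B=4, b_B=2)
limit = posterior_limit(0.5, rho=0.7, a_A=4, b_A=2, a_B=4, b_B=2)
print(finite, limit)  # the two values should be close, and both stay below 1
```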

An immediate implication of Lemma 1 is that given any $s \in \bar{S}$,

$$\phi^1_\infty(\rho(s)) = \phi^2_\infty(\rho(s)) \text{ if and only if } \frac{1-\pi^1}{\pi^1} R^1(\rho(s)) = \frac{1-\pi^2}{\pi^2} R^2(\rho(s)). \tag{5}$$

The proofs of Theorems 2 and 3 now follow from Lemma 1 and equation (5).

Proof of Theorem 2. Under the assumption that $F^i_\theta(1/2) = 0$ in the theorem, the argument in Lemma 1 still applies, and we have $R^i(\rho(s)) = 0$ when $\rho(s) > 1/2$ and $R^i(\rho(s)) = \infty$ when $\rho(s) < 1/2$. Given $\theta = A$, then $r_n(s)/n$ converges to some $\rho(s) > 1/2$ almost surely according to both $i = 1$ and $2$. Hence, $\Pr^i(\phi^1_\infty(\rho(s)) = 1 \mid \theta = A) = \Pr^i(\phi^2_\infty(\rho(s)) = 1 \mid \theta = A) = 1$ for $i = 1,2$. Similarly, $\Pr^i(\phi^1_\infty(\rho(s)) = 0 \mid \theta = B) = \Pr^i(\phi^2_\infty(\rho(s)) = 0 \mid \theta = B) = 1$ for $i = 1,2$, establishing the second part. ■

Proof of Theorem 3. Since $f^i_B(1 - \rho(s)) > 0$ and $f^i_A(\rho(s))$ is finite, $R^i(\rho(s)) > 0$. Hence, by Lemma 1, $\phi^i_\infty(\rho(s)) \neq 1$ for each $s$, establishing the first part. The second part follows from equation (5), since $\pi^1 \neq \pi^2$ and $F^1_\theta = F^2_\theta$ implies that for each $s \in \bar{S}$, $\phi^1_\infty(s) \neq \phi^2_\infty(s)$, and thus $\Pr^i(|\phi^1_\infty(s) - \phi^2_\infty(s)| \neq 0) = 1$ for $i = 1,2$. ■

Intuitively, when Assumption 1 (in particular, the full support feature) holds, an individual is never sure about the exact interpretation of the sequence of signals he observes and will update his views about $p_\theta$ (the informativeness of the signals) as well as his views about the underlying state. For example, even when signal $a$ is more likely in state $A$ than in state $B$, a very high frequency of $a$ will not necessarily convince him that the true state is $A$, because he may infer that the signals are not as reliable as he initially believed, and they may instead be biased towards $a$. Therefore, the individual never becomes certain about the state, which is captured by the fact that $R^i(\rho)$ defined in (4) never takes the value zero or infinity. Consequently, as shown in (3), his posterior beliefs will be determined by his prior beliefs about the state and also by $R^i$, which tells us how the individual updates his beliefs about the informativeness of the signals as he observes the signals. When two individuals interpret the informativeness of the signals in the same way (i.e., $R^1 = R^2$), the differences in their priors will always be reflected in their posteriors.

In contrast, if an individual were sure about the informativeness of the signals (i.e., if $i$ were sure that $p_A = p_B = p^i$ for some $p^i > 1/2$) as in Theorem 1, then he would never question the informativeness of the signals, even when the limiting frequency of $a$ converges to a value different from $p^i$ or $1 - p^i$. Consequently, in this case, for each sample path with $\rho(s) \neq 1/2$ both individuals would learn the true state and their posterior beliefs would agree asymptotically.

As noted above, an important implication of Theorem 3 is that there will typically be "agreement to eventually disagree" between the individuals. In other words, given their priors, both individuals will agree that after seeing the same infinite sequence of signals they will still disagree (with probability 1). This implication is interesting in part because the common prior assumption, typically justified by learning, leads to the celebrated "no agreement to disagree" result (Aumann, 1976, 1998), which states that if the individuals' posterior beliefs are common knowledge, then they must be equal.^8 In contrast, in the limit of the learning process here, individuals' beliefs are common knowledge (as there is no private information), but they are different with probability 1. This is because in the presence of uncertainty and full support as in Assumption 1, both individuals understand that their priors will have an effect on their beliefs even asymptotically; thus they expect to disagree. Many of the applications we discuss in Section 4 exploit this feature.

2.3 Divergence of Opinions

Theorem 3 established that the differences in priors are reflected in the posteriors even in the limit as $n \to \infty$. It does not, however, quantify the possible disagreement between the two individuals. The rest of this section investigates different aspects of this question. We first show that two individuals that observe the same sequence of signals may have diverging posteriors, so that common information can increase disagreement.

Theorem 4 Suppose that subjective probability distributions are given by $F^1_\theta$ and $F^2_\theta$ that satisfy Assumption 1 and that there exists $\epsilon > 0$ such that $|R^1(\rho) - R^2(\rho)| > \epsilon$ for each $\rho \in [0,1]$. Then, there exists an open set of priors $\pi^1$ and $\pi^2$, such that for all $s \in \bar{S}$,

$$\lim_{n\to\infty} |\phi^1_n(s) - \phi^2_n(s)| > |\pi^1 - \pi^2|;$$

in particular,

$$\Pr{}^i\left(\lim_{n\to\infty} |\phi^1_n(s) - \phi^2_n(s)| > |\pi^1 - \pi^2|\right) = 1.$$


Proof. Fix $F^1_\theta$ and $F^2_\theta$ and take $\pi^1 = \pi^2 = 1/2$. By Lemma 1 and the hypothesis that $|R^1(\rho) - R^2(\rho)| > \epsilon$ for each $\rho \in [0,1]$, $\lim_{n\to\infty} |\phi^1_n(s) - \phi^2_n(s)| > \epsilon'$ for some $\epsilon' > 0$, while $|\pi^1 - \pi^2| = 0$. Since both expressions are continuous in $\pi^1$ and $\pi^2$, there is an open neighborhood of 1/2 such that the above inequality uniformly holds for each $\rho$ whenever $\pi^1$ and $\pi^2$ are in this neighborhood. The last statement follows from the fact that $\Pr^i(s \in \bar{S}) = 1$. ■

^8 Note, however, that the "no agreement to disagree" result derives from individuals' updating their beliefs because those of others differ from their own (Geanakoplos and Polemarchakis, 1982), whereas here individuals only update their beliefs by learning.

Intuitively, even a small difference in priors ensures that individuals will interpret signals differently, and if the original disagreement is relatively small, after almost all sequences of signals, the disagreement between the two individuals grows. Consequently, the observation of a common sequence of signals causes an initial difference of opinion between individuals to widen (instead of the standard merging of opinions under certainty). Theorem 4 also shows that both individuals are certain ex ante that their posteriors will diverge after observing the same sequence of signals, because they understand that they will interpret the signals differently. This strengthens our results further and shows that for some priors individuals will "agree to eventually disagree even more".
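
The mechanism behind Theorem 4 can be seen at a single limiting frequency. The sketch below is only an illustration: it takes as given two hypothetical values `R1` and `R2` of the individuals' asymptotic likelihood ratios at that frequency (the numbers are arbitrary, not derived from particular densities), applies the limiting posterior formula of Lemma 1, and compares the limiting disagreement with the prior disagreement.

```python
def limit_posterior(prior, R):
    """Limiting posterior Pr(theta = A) given prior and asymptotic likelihood ratio R."""
    return 1.0 / (1.0 + (1.0 - prior) / prior * R)

def disagreement_grows(pi1, pi2, R1, R2):
    """Return (does the limiting gap exceed the prior gap?, limiting gap)."""
    gap_limit = abs(limit_posterior(pi1, R1) - limit_posterior(pi2, R2))
    return gap_limit > abs(pi1 - pi2), gap_limit

# Priors close to 1/2 but likelihood ratios that differ: the gap widens.
print(disagreement_grows(pi1=0.50, pi2=0.52, R1=0.5, R2=2.0))
```

With priors near 1/2 the prior gap is small, while the gap in limiting posteriors is driven by the difference between `R1` and `R2`, which is the continuity argument used in the proof.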

An interesting implication of Theorem 4 is also worth noting. As demonstrated by Theorems 1 and 2, when there is learning under certainty individuals initially disagree, but each individual also believes that they will eventually agree (and in fact, that they will converge to his beliefs). This implies that each individual expects the other to "learn more". More specifically, let $I_{\theta=A}$ be the indicator function for $\theta = A$ and $\Lambda^i = (\pi^i - I_{\theta=A})^2 - (\phi^i_\infty - I_{\theta=A})^2$ be a measure of learning for individual $i$, and let $E^i$ be the expectation of individual $i$ (under the probability measure $\Pr^i$). Under certainty, Theorem 1 implies that $\phi^i_\infty = \phi^j_\infty = I_{\theta=A}$, so that $E^i[\Lambda^i - \Lambda^j] = -(\pi^i - \pi^j)^2 < 0$ and thus $E^i[\Lambda^i] < E^i[\Lambda^j]$. Under uncertainty, this is not necessarily true. In particular, Theorem 4 implies that, under the assumptions of the theorem, there exists an open subset of the interval $[0,1]$ such that whenever $\pi^1$ and $\pi^2$ are in this subset, we have $E^i[\Lambda^i] > E^i[\Lambda^j]$, so that individual $i$ would expect to learn more than individual $j$. The reason is that individual $i$ is not only confident about his initial guess $\pi^i$, but also expects to learn more from the sequence of signals than individual $j$, because he believes that individual $j$ has the "wrong model of the world." The fact that an individual may expect to learn more than others will play an important role in some of the applications in Section 4.
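
The identity $E^i[\Lambda^i - \Lambda^j] = -(\pi^i - \pi^j)^2$ under certainty can be verified by direct enumeration over the two states. The sketch below assumes the learning measure is the reduction in squared error between the prior and the limiting posterior, each measured against the indicator of $\theta = A$; the function name is hypothetical.

```python
def expected_learning_gap(pi_i, pi_j):
    """E^i[Lambda^i - Lambda^j] when both individuals learn the state (certainty case).

    Lambda^k = (pi_k - I)^2 - (phi_k - I)^2, with I the indicator of theta = A.
    Under certainty both limiting posteriors equal I, so the second term is zero.
    The expectation uses individual i's prior weights on the two states.
    """
    total = 0.0
    for indicator, prob in ((1.0, pi_i), (0.0, 1.0 - pi_i)):
        phi = indicator  # limiting posterior of both individuals under certainty
        lam_i = (pi_i - indicator) ** 2 - (phi - indicator) ** 2
        lam_j = (pi_j - indicator) ** 2 - (phi - indicator) ** 2
        total += prob * (lam_i - lam_j)
    return total

# Matches -(pi_i - pi_j)^2 up to floating-point rounding.
print(expected_learning_gap(0.3, 0.6), -(0.3 - 0.6) ** 2)
```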

2.4 Non-monotonicity of the Likelihood Ratio

We next illustrate that the asymptotic likelihood ratio, $R^i(\rho)$, may be non-monotone, meaning that when an individual observes a high frequency of signals taking the value $a$, he may conclude that the signals are biased towards $a$ and may put lower probability on state $A$ than he would have done with a lower frequency of $a$ among the signals. This feature not only illustrates the types of behavior that are possible when individuals are learning under uncertainty but is also important for the applications we discuss in Section 4.

Inspection of expression (3) establishes the following:

Lemma 2 For any $s \in \bar{S}$, $\phi^i_\infty(s)$ is decreasing at $\rho(s)$ if and only if $R^i$ is increasing at $\rho(s)$.

Proof. This follows immediately from equation (3) above. ■

When $R^i$ is non-monotone, even a small amount of uncertainty about the informativeness of the signals may lead to significant differences in limit posteriors. The next example illustrates this point, while the second example shows that there can be "reversals" in individuals' assessments, meaning that after observing a sequence "favorable" to state $A$, the individual may have a lower posterior about this state than his prior. The impact of small uncertainty on asymptotic agreement will be more systematically studied in the next subsection.

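Example 1 (Non-monotonicity) Each individual $i$ thinks that with probability $1 - \epsilon$, $p_A$ and $p_B$ are in a $\delta$-neighborhood of some $\hat{p}^i > (1 + \delta)/2$, but with probability $\epsilon > 0$, the signals are not informative. More precisely, for $\hat{p}^i > (1 + \delta)/2$, $\epsilon > 0$ and $\delta < |\hat{p}^1 - \hat{p}^2|$, we have

$$f^i_\theta(p) = \begin{cases} \epsilon + (1-\epsilon)/\delta & \text{if } p \in (\hat{p}^i - \delta/2,\ \hat{p}^i + \delta/2) \\ \epsilon & \text{otherwise} \end{cases} \tag{6}$$

for each $\theta$ and $i$. Now, by (4), the asymptotic likelihood ratio is

$$R^i(\rho(s)) = \begin{cases} \dfrac{\epsilon\delta}{1 - \epsilon(1-\delta)} & \text{if } \rho(s) \in (\hat{p}^i - \delta/2,\ \hat{p}^i + \delta/2) \\[2ex] \dfrac{1 - \epsilon(1-\delta)}{\epsilon\delta} & \text{if } \rho(s) \in (1 - \hat{p}^i - \delta/2,\ 1 - \hat{p}^i + \delta/2) \\[2ex] 1 & \text{otherwise.} \end{cases}$$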

This and other relevant functions are plotted in Figure 1 for $\epsilon \to 0$ and $\delta \to 0$. The likelihood ratio $R^i(\rho(s))$ is 1 when $\rho(s)$ is small, takes a very high value at $1 - \hat{p}^i$, goes down to 1 afterwards, becomes nearly zero around $\hat{p}^i$, and then jumps back to 1. By Lemmas 1 and 2, $\phi^i_\infty(s)$ will also be non-monotone: when $\rho(s)$ is small, the signals are not informative, thus $\phi^i_\infty(s)$ is the same as the prior, $\pi^i$. In contrast, around $1 - \hat{p}^i$, the signals become very informative suggesting that the state is $B$, thus $\phi^i_\infty(s) \cong 0$. After this point, the signals become uninformative again and $\phi^i_\infty(s)$ goes back to $\pi^i$. Around $\hat{p}^i$, the signals are again informative, but this time favoring state $A$, so $\phi^i_\infty(s) \cong 1$. Finally, signals again become uninformative and $\phi^i_\infty(s)$ falls back to $\pi^i$. Intuitively, when $\rho(s)$ is around $1 - \hat{p}^i$ or $\hat{p}^i$, the individual assigns very high probability to the true state, but outside of this region, he sticks to his prior, concluding that the signals are not informative.

Figure 1: The three panels show, respectively, the approximate values of $R^i(\rho)$, $\phi^i_\infty$, and $|\phi^1_\infty - \phi^2_\infty|$.
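
The non-monotone shape described in Example 1 can be reproduced with a few lines of code. The sketch below is an illustration only; the parameter values (`p_hat = 0.8`, `eps = 1e-3`, `delta = 0.02`, prior 1/2) and function names are hypothetical choices. It evaluates the piecewise density, the implied asymptotic likelihood ratio, and the limiting posterior at several limiting frequencies.

```python
def density_ex1(p, p_hat, eps, delta):
    """Subjective density: spike of width delta around p_hat plus a flat floor eps."""
    near = abs(p - p_hat) < delta / 2
    return eps + (1 - eps) / delta if near else eps

def R_ex1(rho, p_hat, eps, delta):
    """Asymptotic likelihood ratio f(1 - rho) / f(rho), with the same density in both states."""
    return density_ex1(1 - rho, p_hat, eps, delta) / density_ex1(rho, p_hat, eps, delta)

def phi_ex1(rho, prior, p_hat, eps, delta):
    """Limiting posterior Pr(theta = A) at limiting frequency rho."""
    return 1 / (1 + (1 - prior) / prior * R_ex1(rho, p_hat, eps, delta))

for rho in (0.1, 0.2, 0.5, 0.8, 0.95):
    print(rho, phi_ex1(rho, prior=0.5, p_hat=0.8, eps=1e-3, delta=0.02))
```

The printed posteriors stay at the prior away from `p_hat` and `1 - p_hat`, drop close to 0 near `1 - p_hat`, and rise close to 1 near `p_hat`.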

The first important observation is that even though $\phi^i_\infty$ is equal to the prior for a large range of limiting frequencies, as $\epsilon \to 0$ and $\delta \to 0$ each individual attaches probability 1 to the event that he will learn $\theta$. This is because, as illustrated by the discussion after Theorem 1, as $\epsilon \to 0$ and $\delta \to 0$, each individual becomes convinced that the limiting frequencies will be either $1 - \hat{p}^i$ or $\hat{p}^i$.

However, asymptotic learning is considerably weaker than asymptotic agreement. Each individual also understands that since $\delta < |\hat{p}^1 - \hat{p}^2|$, when the long-run frequency is in a region where he learns that $\theta = A$, the other individual will conclude that the signals are uninformative and adhere to his prior belief. Consequently, he expects the posterior beliefs of the other individual to be always far from his. Put differently, as $\epsilon \to 0$ and $\delta \to 0$, each individual believes that he will learn the value of $\theta$ himself but that the other individual will fail to learn, thus attaches probability 1 to the event that they disagree. This can be seen from the third panel of Figure 1; at each sample path in $\bar{S}$, at least one of the individuals will fail to learn, and the difference between their limiting posteriors will be uniformly higher than the following "objective" bound

$$\min\{\pi^1,\ \pi^2,\ 1 - \pi^1,\ 1 - \pi^2,\ |\pi^1 - \pi^2|\}.$$

When $\pi^1 = 1/3$ and $\pi^2 = 2/3$, this bound is equal to 1/3. In fact, the belief of each individual regarding potential disagreement can be greater than this; each individual believes that he will learn but the other individual will fail to do so. Consequently, for each $i$, $\Pr^i\left(\lim_{n\to\infty} |\phi^1_n(s) - \phi^2_n(s)| > Z\right) > 1 - \epsilon$, where as $\epsilon \to 0$, $Z \to \min\{\pi^1, \pi^2, 1 - \pi^1, 1 - \pi^2\}$. This "subjective" bound can be as high as 1/2.
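
The two bounds are simple functions of the priors. The sketch below, offered only as an illustration, reads the "objective" bound as the minimum over the priors, their complements, and the prior gap, and the "subjective" bound as the same minimum without the prior gap; the function names are hypothetical.

```python
def objective_bound(pi1, pi2):
    """Lower bound on limiting disagreement that holds at every sample path (as eps, delta -> 0)."""
    return min(pi1, pi2, 1 - pi1, 1 - pi2, abs(pi1 - pi2))

def subjective_bound(pi1, pi2):
    """Limit of the disagreement each individual expects with probability close to 1."""
    return min(pi1, pi2, 1 - pi1, 1 - pi2)

print(objective_bound(1 / 3, 2 / 3))   # the case discussed in the text
print(subjective_bound(0.5, 0.5))      # identical priors at 1/2 maximize the subjective bound
```

Note that with identical priors the objective bound is zero while the subjective bound is not, which is the sense in which each individual's expected disagreement can exceed the pathwise bound.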

The next example shows an even more extreme phenomenon, whereby a high frequency of $s = a$ among the signals may reduce the individual's posterior that $\theta = A$ below his prior.

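Example 2 (Reversal) Now suppose that individuals' subjective probability densities are given by

$$f^i_\theta(p) = \begin{cases} (1 - \epsilon - \epsilon^2)/\delta & \text{if } \hat{p}^i - \delta/2 \le p \le \hat{p}^i + \delta/2 \\ \epsilon & \text{if } p < 1/2 \\ \epsilon^2 & \text{otherwise} \end{cases}$$

for each $\theta$ and $i = 1,2$, where $\epsilon > 0$, $\hat{p}^i > 1/2$, and $0 < \delta < \hat{p}^1 - \hat{p}^2$. Clearly, as $\epsilon \to 0$, (4) gives:

$$R^i(\rho(s)) \cong \begin{cases} 0 & \text{if } \rho(s) < 1 - \hat{p}^i - \delta/2, \\ & \text{or } 1 - \hat{p}^i + \delta/2 < \rho(s) < 1/2, \\ & \text{or } \hat{p}^i - \delta/2 \le \rho(s) \le \hat{p}^i + \delta/2 \\ \infty & \text{otherwise.} \end{cases}$$

Hence, the asymptotic posterior probability that $\theta = A$ is

$$\phi^i_\infty(\rho(s)) \cong \begin{cases} 1 & \text{if } \rho(s) < 1 - \hat{p}^i - \delta/2, \\ & \text{or } 1 - \hat{p}^i + \delta/2 < \rho(s) < 1/2, \\ & \text{or } \hat{p}^i - \delta/2 \le \rho(s) \le \hat{p}^i + \delta/2 \\ 0 & \text{otherwise.} \end{cases}$$

Consequently, in this case observing a sufficiently high frequency of $s = a$ may reduce the posterior that $\theta = A$ below the prior. Moreover, the individuals assign probability $1 - \epsilon$ that there will be extreme asymptotic disagreement in the sense that $|\phi^1_\infty(\rho(s)) - \phi^2_\infty(\rho(s))| \cong 1$.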
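
A small numerical sketch makes the reversal concrete. It assumes a three-piece density (a spike of width $\delta$ around $\hat{p}^i$, height $\epsilon$ below 1/2, and height $\epsilon^2$ elsewhere), which is the form consistent with the limiting likelihood ratio stated in the example; the parameter values and function names are hypothetical.

```python
def density_ex2(p, p_hat, eps, delta):
    """Assumed density: spike around p_hat, eps below 1/2, eps**2 elsewhere."""
    if abs(p - p_hat) <= delta / 2:
        return (1 - eps - eps ** 2) / delta
    return eps if p < 0.5 else eps ** 2

def phi_ex2(rho, prior, p_hat, eps, delta):
    """Limiting posterior Pr(theta = A) at limiting frequency rho."""
    R = density_ex2(1 - rho, p_hat, eps, delta) / density_ex2(rho, p_hat, eps, delta)
    return 1 / (1 + (1 - prior) / prior * R)

eps, delta, prior = 1e-4, 0.02, 0.5
# Reversal: a frequency of 'a' above p_hat pushes the posterior on A below the prior.
print(phi_ex2(0.7, prior, p_hat=0.7, eps=eps, delta=delta))
print(phi_ex2(0.9, prior, p_hat=0.7, eps=eps, delta=delta))
# Extreme disagreement at rho = 0.8 between individuals with p_hat 0.8 and 0.7.
gap = abs(phi_ex2(0.8, prior, 0.8, eps, delta) - phi_ex2(0.8, prior, 0.7, eps, delta))
print(gap)
```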

In both examples, it is crucial that the likelihood ratio $R^i$ is not monotone. If $R^i$ were monotone, at least one of the individuals would expect that their beliefs will asymptotically agree. To see this, take $\hat{p}^i > \hat{p}^j$. Given the form of $R^i(\rho)$, individual $i$ is almost certain that, when the state is $A$, $\rho(s)$ will be close to $\hat{p}^i$. He also understands that $j$ would assign a very high probability to the event that $\theta = A$ when $\rho(s) = \hat{p}^j < \hat{p}^i$. If $R^j$ were monotone, individual $j$ would assign even higher probability to $A$ at $\rho(s) = \hat{p}^i$ and thus his probability assessment on $A$ would also converge to 1 as $\epsilon \to 0$. Therefore, in this case $i$ will be almost certain that $j$ will learn the true state and that their beliefs will agree asymptotically.

Theorem 1 established that there will be asymptotic agreement under certainty. One might have thought that as $\epsilon \to 0$ and uncertainty disappears, the same conclusion would apply. In contrast, the above examples show that even as each $F^i_\theta$ converges to a Dirac distribution (that assigns a unit mass to a point), there may be significant asymptotic disagreement between the two individuals. Notably this is true not only when there is negligible uncertainty, i.e., $\epsilon \to 0$ and $\delta \to 0$, but also when the individuals' subjective distributions are nearly identical, i.e., as $\hat{p}^1 - \hat{p}^2 \to 0$. This suggests that the result of asymptotic agreement in Theorem 1 may not be a continuous limit point of a more general model of learning under uncertainty.^9 Instead, we will see in the next subsection that whether or not there is asymptotic agreement under approximate certainty (i.e., as $F^i_\theta$ becomes more and more concentrated around a point) is determined by the tail properties of the family of distributions $F^i_\theta$.

2.5 Agreement and Disagreement with Approximate Certainty

In this subsection, we characterize the conditions under which "approximate certainty" ensures asymptotic agreement. More specifically, we will study the behavior of asymptotic beliefs as the subjective probability distribution $F^i_\theta$ converges to a Dirac distribution and the uncertainty about the interpretation of the signals disappears. As already illustrated by Example 1, as $F^i_\theta$ converges to a Dirac distribution, each individual will become increasingly convinced that he will learn the true state. However, because asymptotic agreement is considerably more demanding than asymptotic learning, this does not guarantee that the individuals will believe that they will also agree on $\theta$. We will demonstrate that whether or not there is asymptotic agreement in the limit depends on the family of distributions converging to certainty; in particular, on their tail properties. For many natural distributions, a small amount of uncertainty about the informativeness of the signals is sufficient to lead to significant differences in posteriors.

To state and prove our main result in this case, consider a family of subjective probability density functions $f^i_{\theta,m}$ for $i = 1,2$, $\theta \in \{A, B\}$ and $m \in \mathbb{Z}_+$, such that as $m \to \infty$, we have that $F^i_{\theta,m} \to F^i_{\theta,\infty}$ where $F^i_{\theta,\infty}$ assigns probability 1 to $p = \hat{p}^i$ for some $\hat{p}^i \in (1/2, 1)$. Naturally, there are many different ways in which a family of subjective probability distributions may converge to such a limiting distribution. Both for tractability and to make the analysis more concrete, we focus on families of subjective probability distributions $\{f^i_{\theta,m}\}$ parameterized by a determining density function $f$. We impose the following conditions on $f$:

^9 Nevertheless, it is also not the case that asymptotic agreement under approximate certainty requires the support of the distribution of each $F^i_\theta$ to converge to a set as in Theorem 2 (that does not assign positive probability to $p^i_\theta < 1/2$). See Theorem 5 below.

(i) $f$ is symmetric around zero;

(ii) there exists $\bar{x} < \infty$ such that $f(x)$ is decreasing for all $x \ge \bar{x}$;

(iii)

$$R(x, y) \equiv \lim_{m\to\infty} \frac{f(mx)}{f(my)} \tag{7}$$

exists in $[0, \infty]$ at all $(x, y) \in \mathbb{R}^2_+$.^10

Conditions (i) and (ii) are natural and serve to simplify the notation. Condition (iii) introduces the function $R(x, y)$, which will arise naturally in the study of asymptotic agreement and has a natural meaning in asymptotic statistics (see Definitions 1 and 2 below).

In order to vary the amount of uncertainty, we consider mappings of the form $x \mapsto (x - y)/m$, which scale down the real line around $y$ by the factor $1/m$. The family of subjective densities for individuals' beliefs about $p_A$ and $p_B$, $\{f^i_{\theta,m}\}$, will be determined by $f$ and the transformation $x \mapsto (x - \hat{p}^i)/m$.^11 In particular, we consider the following family of densities

$$f^i_{\theta,m}(p) = c^i(m)\, f\left(m(p - \hat{p}^i)\right) \tag{8}$$

for each $\theta$ and $i$, where $c^i(m) \equiv 1/\int_0^1 f\left(m(p - \hat{p}^i)\right) dp$ is a correction factor to ensure that $f^i_{\theta,m}$ is a proper probability density function on $[0,1]$ for each $m$. We also define $\phi^i_{\infty,m} \equiv \lim_{n\to\infty} \phi^i_{n,m}(s)$ as the limiting posterior distribution of individual $i$ when he believes that the probability density of signals is $f^i_{\theta,m}$. In this family of subjective densities, the uncertainty about $p_A$ is scaled down by $1/m$, and $f^i_{\theta,m}$ converges to unit mass at $\hat{p}^i$ as $m \to \infty$, so that individual $i$ becomes sure about the informativeness of the signals in the limit. In other words, as $m \to \infty$, this family of subjective probability distributions leads to approximate certainty (and ensures asymptotic learning; see the proof of Part 1 of Theorem 5).
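
To see how the scaled family concentrates, the sketch below instantiates the construction with the standard normal density as the determining function. This is an illustrative choice that satisfies the symmetry and tail-monotonicity conditions; the function names and the value `p_hat = 0.7` are hypothetical. It computes the correction factor in closed form through the normal distribution function and reports how much probability mass lies within a small distance of `p_hat` as `m` grows.

```python
import math

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def correction_factor(m, p_hat):
    """c(m) = 1 / integral over [0, 1] of f(m (p - p_hat)) dp, via x = m (p - p_hat)."""
    integral = (std_normal_cdf(m * (1 - p_hat)) - std_normal_cdf(-m * p_hat)) / m
    return 1.0 / integral

def density_m(p, m, p_hat):
    """Scaled subjective density c(m) f(m (p - p_hat)) on [0, 1]."""
    return correction_factor(m, p_hat) * std_normal_pdf(m * (p - p_hat))

def mass_near(m, p_hat, eta):
    """Probability that p lies within eta of p_hat under the scaled density."""
    lo, hi = max(0.0, p_hat - eta), min(1.0, p_hat + eta)
    num = std_normal_cdf(m * (hi - p_hat)) - std_normal_cdf(m * (lo - p_hat))
    den = std_normal_cdf(m * (1 - p_hat)) - std_normal_cdf(-m * p_hat)
    return num / den

for m in (1, 10, 100):
    print(m, mass_near(m, p_hat=0.7, eta=0.05))
```

The mass near `p_hat` rises towards 1 as `m` increases, which is the sense in which the family approaches certainty.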


^10 Convergence will be uniform in most cases in view of the results discussed following Definition 1 below (and of Egorov's Theorem, which links pointwise convergence of a family of functions to a limiting function to uniform convergence; see, for example, Billingsley, 1995, Section 13).

^11 This formulation assumes that $\hat{p}^i_A$ and $\hat{p}^i_B$ are equal. We can easily assume these to be different, but do not introduce this generality here to simplify the exposition. Theorem 8 allows for such differences in the context of the more general model with multiple states and multiple signals.

The next theorem characterizes the class of determining functions $f$ for which the resulting family of the subjective densities $\{f^i_{\theta,m}\}$ leads to asymptotic agreement under approximate certainty.

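Theorem 5 Suppose that Assumption 1 holds. For each $i = 1,2$, consider the family of subjective densities $\{f^i_{\theta,m}\}$ defined in (8) for some $\hat{p}^i > 1/2$, with $f$ satisfying conditions (i)-(iii) above. Suppose that $f(mx)/f(my)$ uniformly converges to $R(x, y)$ over a neighborhood of $(\hat{p}^1 + \hat{p}^2 - 1, |\hat{p}^1 - \hat{p}^2|)$. Then,

1. $\lim_{m\to\infty} \left(\phi^i_{\infty,m}(\hat{p}^i) - \phi^j_{\infty,m}(\hat{p}^i)\right) = 0$ if and only if $R(\hat{p}^1 + \hat{p}^2 - 1, |\hat{p}^1 - \hat{p}^2|) = 0$.

2. Suppose that $R(\hat{p}^1 + \hat{p}^2 - 1, |\hat{p}^1 - \hat{p}^2|) = 0$. Then for every $\epsilon > 0$ and $\delta > 0$, there exists $\bar{m} \in \mathbb{Z}_+$ such that

$$\Pr{}^i\left(\lim_{n\to\infty} |\phi^1_{n,m}(s) - \phi^2_{n,m}(s)| > \epsilon\right) < \delta \qquad (\forall m > \bar{m},\ i = 1,2).$$

3. Suppose that $R(\hat{p}^1 + \hat{p}^2 - 1, |\hat{p}^1 - \hat{p}^2|) \neq 0$. Then there exists $\epsilon > 0$ such that for each $\delta > 0$, there exists $\bar{m} \in \mathbb{Z}_+$ such that:

$$\Pr{}^i\left(\lim_{n\to\infty} |\phi^1_{n,m}(s) - \phi^2_{n,m}(s)| > \epsilon\right) > 1 - \delta \qquad (\forall m > \bar{m},\ i = 1,2).$$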

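Proof. (Proof of Part 1) Let $R^i_m(\rho)$ be the asymptotic likelihood ratio as defined in (4) associated with subjective density $f^i_{\theta,m}$. One can easily check that $\lim_{m\to\infty} R^i_m(\hat{p}^i) = 0$. Hence, by (5), $\lim_{m\to\infty} \left(\phi^i_{\infty,m}(\hat{p}^i) - \phi^j_{\infty,m}(\hat{p}^i)\right) = 0$ if and only if $\lim_{m\to\infty} R^j_m(\hat{p}^i) = 0$. By definition, we have:

$$\lim_{m\to\infty} R^j_m(\hat{p}^i) = \lim_{m\to\infty} \frac{f\left(m(1 - \hat{p}^i - \hat{p}^j)\right)}{f\left(m(\hat{p}^i - \hat{p}^j)\right)} = R(1 - \hat{p}^i - \hat{p}^j,\ \hat{p}^i - \hat{p}^j) = R(\hat{p}^1 + \hat{p}^2 - 1,\ |\hat{p}^1 - \hat{p}^2|),$$

where the last equality follows by condition (i), the symmetry of the function $f$. This establishes that $\lim_{m\to\infty} R^j_m(\hat{p}^i) = 0$ (and thus $\lim_{m\to\infty} \left(\phi^i_{\infty,m}(\hat{p}^i) - \phi^j_{\infty,m}(\hat{p}^i)\right) = 0$) if and only if $R(\hat{p}^1 + \hat{p}^2 - 1, |\hat{p}^1 - \hat{p}^2|) = 0$.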

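(Proof of Part 2) Take any $\epsilon > 0$ and $\delta > 0$, and assume that $R(\hat{p}^1 + \hat{p}^2 - 1, |\hat{p}^1 - \hat{p}^2|) = 0$. By Lemma 1, there exists $\epsilon' > 0$ such that $\phi^i_{\infty,m}(\rho(s)) > 1 - \epsilon$ whenever $R^i_m(\rho(s)) < \epsilon'$. There also exists $x_0$ such that

$$\int_{-x_0}^{x_0} f(x)\,dx > 1 - \delta. \tag{9}$$

Let $\kappa = \min_{x \in [-x_0, x_0]} f(x) > 0$. Since $f$ monotonically decreases to zero in the tails (see (ii) above), there exists $x_1$ such that $f(x) < \epsilon'\kappa$ whenever $|x| > |x_1|$. Let $m_1 = (x_0 + x_1)/(2\hat{p}^i - 1) > 0$. Then, for any $m > m_1$ and $\rho(s) \in (\hat{p}^i - x_0/m, \hat{p}^i + x_0/m)$, we have $|\rho(s) - 1 + \hat{p}^i| > x_1/m$, and hence

$$R^i_m(\rho(s)) = \frac{f\left(m(\rho(s) + \hat{p}^i - 1)\right)}{f\left(m(\rho(s) - \hat{p}^i)\right)} < \frac{\epsilon'\kappa}{\kappa} = \epsilon'.$$

Therefore, for all $m > m_1$ and $\rho(s) \in (\hat{p}^i - x_0/m, \hat{p}^i + x_0/m)$, we have that

$$\phi^i_{\infty,m}(\rho(s)) > 1 - \epsilon. \tag{10}$$

Again, by Lemma 1, there exists $\epsilon'' > 0$ such that $\phi^j_{\infty,m}(\rho(s)) > 1 - \epsilon$ whenever $R^j_m(\rho(s)) < \epsilon''$. Now, for each $\rho(s)$,

$$\lim_{m\to\infty} R^j_m(\rho(s)) = R\left(\rho(s) + \hat{p}^j - 1,\ |\rho(s) - \hat{p}^j|\right). \tag{11}$$

Moreover, by the uniform convergence assumption, there exists $\eta > 0$ such that $R^j_m(\rho(s))$ uniformly converges to $R\left(\rho(s) + \hat{p}^j - 1, |\rho(s) - \hat{p}^j|\right)$ on $(\hat{p}^i - \eta, \hat{p}^i + \eta)$ and

$$R\left(\rho(s) + \hat{p}^j - 1,\ |\rho(s) - \hat{p}^j|\right) < \epsilon''/2$$

for each $\rho(s)$ in $(\hat{p}^i - \eta, \hat{p}^i + \eta)$. Moreover, uniform convergence also implies that $R$ is continuous at $(\hat{p}^1 + \hat{p}^2 - 1, |\hat{p}^1 - \hat{p}^2|)$ (and in this part of the proof, by hypothesis, it takes the value 0). Hence, there exists $m_2 < \infty$ such that for all $m > m_2$ and $\rho(s) \in (\hat{p}^i - \eta, \hat{p}^i + \eta)$,

$$R^j_m(\rho(s)) < R\left(\rho(s) + \hat{p}^j - 1,\ |\rho(s) - \hat{p}^j|\right) + \epsilon''/2 < \epsilon''.$$

Therefore, for all $m > m_2$ and $\rho(s) \in (\hat{p}^i - \eta, \hat{p}^i + \eta)$, we have

$$\phi^j_{\infty,m}(\rho(s)) > 1 - \epsilon. \tag{12}$$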

Set $\bar m \equiv \max\{m_1, m_2, x_0/\eta\}$, so that $(p^i - x_0/m,\, p^i + x_0/m) \subset (p^i - \eta,\, p^i + \eta)$ for every $m > \bar m$. Then, by (10) and (12), for any $m > \bar m$ and $p(s) \in (p^i - x_0/m,\, p^i + x_0/m)$, we have $|\phi^1_{\infty,m}(p(s)) - \phi^2_{\infty,m}(p(s))| < \varepsilon$. Then, (9) implies that $\Pr^i\left(|\phi^1_{\infty,m}(p(s)) - \phi^2_{\infty,m}(p(s))| < \varepsilon \mid \theta = A\right) > 1 - \delta$. By the symmetry of $A$ and $B$, this establishes that $\Pr^i\left(|\phi^1_{\infty,m}(p(s)) - \phi^2_{\infty,m}(p(s))| < \varepsilon\right) > 1 - \delta$ for $m > \bar m$.

(Proof of Part 3) Since $\lim_{m\to\infty} R^j_m(p^i) = R(p^1 + p^2 - 1, |p^1 - p^2|)$ is assumed to be strictly positive, $\lim_{m\to\infty} \phi^j_{\infty,m}(p^i) < 1$. We set $\varepsilon = \left(1 - \lim_{m\to\infty} \phi^j_{\infty,m}(p^i)\right)/2$ and use similar arguments to those in the proof of Part 2 to obtain the desired conclusion. ■

Theorem 5 provides a complete characterization of the conditions under which approximate certainty will lead to asymptotic agreement. In particular, it shows that, while approximate certainty ensures asymptotic learning, it may not be sufficient to guarantee asymptotic agreement. This contrasts with the result in Theorem 1 that there will always be asymptotic agreement under full certainty. Theorem 5, instead, shows that even a small amount of uncertainty may be sufficient to cause disagreement between the individuals.

The first part of the theorem provides a simple condition on the tail of the distribution $f$ that determines whether the asymptotic difference between the posteriors is small under approximate certainty. This condition can be expressed as:

\[
R(p^1 + p^2 - 1,\, |p^1 - p^2|) \equiv \lim_{m\to\infty} \frac{f(m(p^1 + p^2 - 1))}{f(m(p^1 - p^2))} = 0. \tag{13}
\]

The theorem shows that if this condition is satisfied, then as uncertainty about the informativeness of the signals disappears the difference between the posteriors of the two individuals will become negligible. Notice that condition (13) is symmetric and does not depend on $i$.

Intuitively, condition (13) is related to the beliefs of one individual on whether the other individual will learn. Under approximate certainty, we always have that $\lim_{m\to\infty} R^i_m(p^i) = 0$, so that each agent believes that he will learn the value of $\theta$ with probability 1. Asymptotic agreement (or lack thereof) depends on whether he also believes the other individual will learn the value of $\theta$. When $R(p^1 + p^2 - 1, |p^1 - p^2|) = 0$, an individual who expects a limiting frequency of $p^2$ in the asymptotic distribution will still learn the true state when the limiting frequency is $p^1$. Therefore, individual 1, who is almost certain that the limiting frequency will be $p^1$, still believes that individual 2 will reach the same inference as himself. In contrast, when $R(p^1 + p^2 - 1, |p^1 - p^2|) \neq 0$, individual 1 is still certain that the limiting frequency of signals will be $p^1$ and thus expects to learn himself. However, he understands that, when $R(p^1 + p^2 - 1, |p^1 - p^2|) \neq 0$, an individual who expects a limiting frequency of $p^2$ will fail to learn the true state when the limiting frequency happens to be $p^1$. Since he is almost certain that the limiting frequency will be $p^1$ (or $1 - p^1$), he expects the other agent not to learn the truth and thus he expects the disagreement between them to persist asymptotically.

Parts 2 and 3 of the theorem then exploit this result and the continuity of $R$ to show that the individuals will attach probability 1 to the event that the asymptotic difference between their beliefs will disappear when (13) holds, and they will attach probability 1 to asymptotic disagreement when (13) fails to hold. Thus the behavior of asymptotic beliefs under approximate certainty is completely determined by condition (13).

Theorem 5 establishes that whether or not there will be asymptotic agreement depends on whether $R(p^1 + p^2 - 1, |p^1 - p^2|)$ is equal to 0. We next investigate what this condition means for determining distributions $f$. Clearly, this will depend on the tail behavior of $f$, which, in turn, determines the behavior of the family of subjective densities $\{f^i_{\theta,m}\}$. Suppose $x \equiv p^1 + p^2 - 1 > p^1 - p^2 \equiv y > 0$. Then, condition (13) can be expressed as

\[
\lim_{m\to\infty} \frac{f(mx)}{f(my)} = 0.
\]

This condition holds for distributions with exponential tails, such as the exponential or the normal distributions. On the other hand, it fails for distributions with polynomial tails. For example, consider the Pareto distribution, where $f(x)$ is proportional to $|x|^{-\alpha}$ for some $\alpha > 1$. Then, for each $m$,

\[
\frac{f(mx)}{f(my)} = \left(\frac{x}{y}\right)^{-\alpha} > 0.
\]

This implies that for the Pareto distribution, individuals' beliefs will fail to converge even when there is a negligible amount of uncertainty. In fact, for this distribution, the asymptotic beliefs will be independent of $m$ (since $R^i_m$ does not depend on $m$). If we take $\pi^1 = \pi^2 = 1/2$, then the asymptotic posterior probability of $\theta = A$ according to $i$ is

\[
\phi^i_{\infty,m}(p(s)) = \frac{(p(s) - p^i)^{-\alpha}}{(p(s) - p^i)^{-\alpha} + (p(s) + p^i - 1)^{-\alpha}}
\]

for any $m$.
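The Pareto formula above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `pareto_posterior` and its arguments are hypothetical, and it assumes the determining density is proportional to $|x|^{-\alpha}$ so that the factor $m$ cancels from the likelihood ratio.

```python
def pareto_posterior(p, p_i, alpha, prior=0.5):
    """Asymptotic posterior of theta = A at limiting frequency p (illustrative).

    Assumes f(x) is proportional to |x|**(-alpha), so the ratio
    R = f(m*(p + p_i - 1)) / f(m*(p - p_i)) does not depend on m.
    """
    a = abs(p - p_i)          # distance from the frequency expected under theta = A
    b = abs(p + p_i - 1.0)    # distance from the frequency expected under theta = B
    if b == 0.0:
        return 0.0            # frequency sits exactly where theta = B is expected
    ratio = (a / b) ** alpha  # equals (b / a)**(-alpha); the factor m cancels
    return 1.0 / (1.0 + (1.0 - prior) / prior * ratio)


# Individual 1 expects p1, individual 2 expects p2; evaluate both at p(s) = p1.
p1, p2, alpha = 0.75, 0.70, 2.0
gap = pareto_posterior(p1, p1, alpha) - pareto_posterior(p1, p2, alpha)
print(gap)
```

With $x = p^1 + p^2 - 1$ and $y = p^1 - p^2$, the printed gap should agree with $x^{-\alpha}/(x^{-\alpha} + y^{-\alpha})$, and evaluating the function on either side of $p^i$ shows the non-monotonicity discussed in the text.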

As illustrated in Figure 2, in this case $\phi^i_{\infty,m}$ is not monotone (in fact, the discussion in the previous subsection explained why it had to be non-monotone for asymptotic agreement to break down). To see the magnitude of asymptotic disagreement, consider $p(s) = p^i$. In that case, $\phi^i_{\infty,m}(p(s))$ is approximately 1, and $\phi^j_{\infty,m}(p(s))$ is approximately $y^{-\alpha}/(x^{-\alpha} + y^{-\alpha})$. Hence, both individuals believe that the difference between their asymptotic posteriors will be

\[
\left|\phi^1_{\infty,m} - \phi^2_{\infty,m}\right| \cong \frac{x^{-\alpha}}{x^{-\alpha} + y^{-\alpha}}.
\]

This asymptotic difference is increasing with the difference $y \equiv p^1 - p^2$, which corresponds to the difference in the individuals' views on which frequencies of signals are most likely. It is also clear from this expression that this asymptotic difference will converge to zero as $y \to 0$ (i.e., as $p^1 \to p^2$).

Figure 2: $\lim_{n\to\infty} \phi^i_n(s)$ for Pareto distribution as a function of $p(s)$ [$\alpha = 2$, $p^i = 3/4$].

This last statement is indeed generally true when $R$ is continuous:

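Proposition 1 In Theorem 5, in addition, assume that $R$ is continuous on the set $D = \{(x,y) \mid -1 \le x \le 1,\ |y| \le \bar y\}$ for some $\bar y > 0$. Then for every $\varepsilon > 0$ and $\delta > 0$, there exist $\lambda > 0$ and $\bar m \in (0, \infty)$ such that whenever $|p^1 - p^2| < \lambda$,

\[
\Pr^i\left(\lim_{n\to\infty} \left|\phi^1_{n,m}(s) - \phi^2_{n,m}(s)\right| > \varepsilon\right) < \delta \qquad (\forall m > \bar m,\ i = 1,2).
\]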


Proof. To prove this proposition, we modify the proof of Part 2 of Theorem 5 and use the notation in that proof. Since $R$ is continuous on the compact set $D$ and $R(x, 0) = 0$ for each $x$, there exists $\lambda > 0$ such that $R(p^1 + p^2 - 1, |p^1 - p^2|) < \varepsilon''/4$ whenever $|p^1 - p^2| < \lambda$. Fix
any such $p^1$ and $p^2$. Then, by the uniform convergence assumption, there exists $\eta > 0$ such that $R^j_m(p(s))$ uniformly converges to $R(p(s) + p^j - 1, |p(s) - p^j|)$ on $(p^i - \eta,\, p^i + \eta)$ and

\[
R(p(s) + p^j - 1,\, |p(s) - p^j|) < \varepsilon''/2
\]

for each $p(s)$ in $(p^i - \eta,\, p^i + \eta)$. The rest of the proof is identical to the proof of Part 2 in Theorem 5. ■

This proposition implies that if the individuals are almost certain about the informativeness of signals, then any significant difference in their asymptotic beliefs must be due to a significant difference in their subjective densities regarding the signal distribution (i.e., it must be the case that $|p^1 - p^2|$ is not small). In particular, the continuity of $R$ in Proposition 1 implies that when $p^1 = p^2$, we must have $R(p^1 + p^2 - 1, |p^1 - p^2|) = 0$, and thus, from Theorem 5, there will be no significant differences in asymptotic beliefs. Notably, however, the requirement that $p^1 = p^2$ is rather strong. For example, Theorem 1 established that under certainty there will be asymptotic agreement for all $p^1, p^2 > 1/2$.

It is also worth noting that the assumption that $R$ or $\lim_{m\to\infty} R^i_m(p)$ is continuous in the relevant range is important for the results in Proposition 1. In particular, recall that Example 1 illustrated a situation in which this assumption failed and the asymptotic differences remained bounded away from zero, irrespective of the gap between $p^1$ and $p^2$.

We next focus on the case where $p^1 \neq p^2$ and provide a further characterization of which classes of determining functions lead to asymptotic agreement under approximate certainty. We first define:

Definition 1 A density function $f$ has regularly-varying tails if it has unbounded support and satisfies

\[
\lim_{m\to\infty} \frac{f(mx)}{f(m)} = H(x) \in \mathbb{R}
\]

for any $x > 0$.


The condition in Definition 1 that $H(x) \in \mathbb{R}$ is relatively weak, but nevertheless has important implications. In particular, it implies that $H(x) = x^{-\alpha}$ for $\alpha \in (0, \infty)$. This follows from the fact that in the limit, the function $H(\cdot)$ must be a solution to the functional equation $H(x)H(y) = H(xy)$, which is only possible if $H(x) = x^{-\alpha}$ for $\alpha \in (0, \infty)$.^{12} Moreover, Seneta (1976) shows that the convergence in Definition 1 holds locally uniformly, i.e., uniformly for $x$ in any compact subset of $(0, \infty)$. This implies that if a density $f$ has regularly-varying tails, then the assumptions imposed in Theorem 5 (in particular, the uniform convergence assumption) are satisfied. In fact, we have that, in this case, $R$ defined in (7) is given by the same expression as for the Pareto distribution,

\[
R(x, y) = \left(\frac{x}{y}\right)^{-\alpha},
\]

and is everywhere continuous. As this expression suggests, densities with regularly-varying tails behave approximately like power functions in the tails; indeed a density $f(x)$ with regularly-varying tails can be written as $f(x) = \mathcal{L}(x)\,x^{-\alpha}$ for some slowly-varying function $\mathcal{L}$ (with $\lim_{m\to\infty} \mathcal{L}(mx)/\mathcal{L}(m) = 1$). Many common distributions, including the Pareto, log-normal, and t-distributions, have regularly-varying densities.

^{12} To see this, note that since $\lim_{m\to\infty} (f(mx)/f(m)) = H(x) \in \mathbb{R}$, we have
\[
H(xy) = \lim_{m\to\infty} \left(\frac{f(mxy)}{f(m)}\right) = \lim_{m\to\infty} \left(\frac{f(mxy)}{f(my)} \cdot \frac{f(my)}{f(m)}\right) = H(x)\,H(y).
\]
See de Haan (1970) or Feller (1971).

We also define:

Definition 2 A density function $f$ has rapidly-varying tails if it satisfies

\[
\lim_{m\to\infty} \frac{f(mx)}{f(m)} = x^{-\infty} \equiv
\begin{cases}
0 & \text{if } x > 1 \\
1 & \text{if } x = 1 \\
\infty & \text{if } x < 1
\end{cases}
\]

for any $x > 0$.


As  in  Definition  1,  the  above  convergence  holds  locally  uniformly  (uniformly  in  x  over  any 
compact  subset  that  excludes  1).  Examples  of  densities  with  rapidly- varying  tails  include  the 
exponential  and  the  normal  densities. 
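The two definitions can be contrasted numerically by evaluating $f(mx)/f(m)$ for a fixed $x$ and a large $m$. This is only an illustrative sketch: the helper names are hypothetical, the power-law density omits its normalizing constant (which cancels in the ratio), and log-densities are used to avoid floating-point underflow.

```python
import math


def normal_logpdf(x):
    """Log-density of the standard normal distribution."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def power_logpdf(x, alpha=2.0):
    """Log of a tail density proportional to |x|**(-alpha) (constant omitted)."""
    return -alpha * math.log(abs(x))


def tail_ratio(logpdf, x, m):
    """Return f(m*x) / f(m), computed on the log scale for stability."""
    return math.exp(logpdf(m * x) - logpdf(m))


for m in (5.0, 10.0, 20.0):
    # The normal ratio collapses toward 0 for x > 1, while the
    # power-law ratio stays at x**(-alpha) for every m.
    print(m, tail_ratio(normal_logpdf, 2.0, m), tail_ratio(power_logpdf, 2.0, m))
```

For the power-law density the ratio equals $x^{-\alpha}$ at every $m$, matching Definition 1, whereas the normal ratio moves toward 0 for $x > 1$ and grows without bound for $x < 1$, matching Definition 2.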

From these definitions, the following corollary to Theorem 5 is immediate and links asymptotic agreement under approximate certainty to the tail behavior of the determining density function.

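Corollary 1 Suppose that Assumption 1 holds and $p^1 \neq p^2$.

1. Suppose that in Theorem 5 $f$ has regularly-varying tails. Then there exists $\varepsilon > 0$ such that for each $\delta > 0$, there exists $\bar m \in \mathbb{Z}_+$ such that

\[
\Pr^i\left(\lim_{n\to\infty} \left|\phi^1_{n,m}(s) - \phi^2_{n,m}(s)\right| > \varepsilon\right) > 1 - \delta \qquad (\forall m > \bar m,\ i = 1,2).
\]

2. Suppose that in Theorem 5 $f$ has rapidly-varying tails. Then for every $\varepsilon > 0$ and $\delta > 0$, there exists $\bar m \in \mathbb{Z}_+$ such that

\[
\Pr^i\left(\lim_{n\to\infty} \left|\phi^1_{n,m}(s) - \phi^2_{n,m}(s)\right| > \varepsilon\right) < \delta \qquad (\forall m > \bar m,\ i = 1,2).
\]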

This corollary therefore implies that whether there will be asymptotic agreement depends on whether the family of subjective densities converging to "certainty" has regularly or rapidly-varying tails (provided that $p^1 \neq p^2$).

Returning to the intuition above, Corollary 1 and the previous definitions make it clear that the failure of asymptotic agreement is related to disagreement between the two individuals about limiting frequencies, i.e., $p^1 \neq p^2$, together with sufficiently thick tails of the subjective probability distribution so that an individual who expects $p^2$ should have sufficient uncertainty when confronted with a limiting frequency of $p^1$. Along the lines of the intuition given there, this is sufficient for both individuals to believe that they will learn the true value of $\theta$ themselves, but that the other individual will fail to do so. Rapidly-varying tails imply that individuals become relatively certain of their model of the world and thus when individual $i$ observes a limiting frequency $p$ close to but different from $p^i$, he will interpret this as driven by sampling variation and attach a high probability to $\theta = A$. This will guarantee asymptotic agreement between the two individuals. In contrast, with regularly-varying tails, even under approximate certainty, limiting frequencies different from $p^i$ will be interpreted not as a sampling variation, but as potential evidence for $\theta = B$, preventing asymptotic agreement.

3      Generalizations 

The previous section provided our main results in an environment with two states and two signals. In this section, we show that our main results generalize to an environment with $K \ge 2$ states and $L \ge K$ signals. The main results parallel those of Section 2, and all the proofs for this section are contained in the Appendix.

To generalize our results to this environment, let $\theta \in \Theta$, where $\Theta \equiv \{A^1, \dots, A^K\}$ is a set containing $K \ge 2$ distinct elements. We refer to a generic element of the set by $A^k$. Similarly, let $s_t \in \{a^1, \dots, a^L\}$, with $L \ge K$ signal values. As before, define $s \equiv \{s_t\}_{t=1}^{\infty}$, and for each $l = 1, \dots, L$, let

\[
r^l_n(s) \equiv \#\{t \le n \mid s_t = a^l\}
\]

be the number of times the signal $s_t = a^l$ occurs among the first $n$ signals. Once again, the strong law of large numbers implies that, according to both individuals, for each $l = 1, \dots, L$, $r^l_n(s)/n$ almost surely converges to some $p^l(s) \in [0,1]$ with $\sum_{l=1}^{L} p^l(s) = 1$. Define $p(s) \in \Delta(L)$ as the vector $p(s) \equiv (p^1(s), \dots, p^L(s))$, where $\Delta(L) \equiv \{p = (p^1, \dots, p^L) \in [0,1]^L : \sum_{l=1}^{L} p^l = 1\}$, and let the set $\bar S$ be

\[
\bar S \equiv \left\{ s \in S : \lim_{n\to\infty} r^l_n(s)/n \text{ exists for each } l = 1, \dots, L \right\}. \tag{14}
\]
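As a concrete illustration of the counts $r^l_n(s)$ and the frequency vector $r_n(s)/n$, the following short sketch computes them for a finite prefix of signals. The function name `empirical_frequencies` and the string labels for signal values are hypothetical choices made only for this example.

```python
from collections import Counter


def empirical_frequencies(signals, values):
    """Return the vector (r_n^l / n) for l = 1..L over the first n signals.

    `signals` is a finite prefix of the signal sequence and `values` lists
    the L possible signal values in a fixed order.
    """
    n = len(signals)
    counts = Counter(signals)  # r_n^l for each observed value
    return [counts[v] / n for v in values]


prefix = ["a1", "a2", "a1", "a3"]
freqs = empirical_frequencies(prefix, ["a1", "a2", "a3"])
print(freqs, sum(freqs))
```

The returned vector lies in $\Delta(L)$ by construction, since the counts sum to $n$.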

With analogy to the two-state-two-signal model in Section 2, let $\pi^i_k > 0$ be the prior probability individual $i$ assigns to $\theta = A^k$, $\pi^i \equiv (\pi^i_1, \dots, \pi^i_K)$, and $p^l_\theta$ be the frequency of observing signal $s = a^l$ when the true state is $\theta$. When players are certain about the $p^l_\theta$'s as in usual models, immediate generalizations of Theorems 1 and 2 apply. With analogy to before, we define $F^i_\theta$ as the joint subjective probability distribution of conditional frequencies $p \equiv (p^1_\theta, \dots, p^L_\theta)$ according to individual $i$. Since our focus is learning under uncertainty, we impose an assumption similar to Assumption 1.

Assumption 2 For each $i$ and $\theta$, the distribution $F^i_\theta$ over $\Delta(L)$ has a continuous, non-zero and finite density $f^i_\theta$ over $\Delta(L)$.

This assumption can be weakened along the lines discussed in Remark 2 above.
We also define $\phi^i_{k,n}(s) \equiv \Pr^i\left(\theta = A^k \mid \{s_t\}_{t=0}^{n}\right)$ for each $k = 1, \dots, K$ as the posterior probability that $\theta = A^k$ after observing the sequence of signals $\{s_t\}_{t=0}^{n}$, and

\[
\phi^i_{k,\infty}(p(s)) \equiv \lim_{n\to\infty} \phi^i_{k,n}(s).
\]

Given this structure, it is straightforward to generalize the results in Section 2. Let us now define the transformation $T_k : \mathbb{R}^K \to \mathbb{R}^{K-1}$, such that

\[
T_k(x) = \left(\frac{x_{k'}}{x_k};\ k' \in \{1, \dots, K\} \setminus k\right).
\]

Here $T_k(x)$ is taken as a column vector. This transformation will play a useful role in the theorems and the proofs. In particular, this transformation will be applied to the vector $\pi^i$ of priors to determine the ratio of priors assigned to the different states by individual $i$. Let us also define the norm $\|x\| \equiv \max_l |x^l|$ for $x = (x^1, \dots, x^L) \in \mathbb{R}^L$.
The next lemma generalizes Lemma 1:

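Lemma 3 Suppose Assumption 2 holds. Then for all $s \in \bar S$,

\[
\phi^i_{k,\infty}(p(s)) = \frac{1}{1 + \sum_{k' \neq k} \dfrac{\pi^i_{k'}}{\pi^i_k}\, \dfrac{f^i_{A^{k'}}(p(s))}{f^i_{A^k}(p(s))}}.
\]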
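The expression in Lemma 3 is Bayes' rule written in terms of prior ratios and density ratios. As a hedged illustration (the function names and the numerical inputs below are hypothetical), the following sketch evaluates it both directly and in the ratio form, given the values of the $K$ subjective densities at an observed frequency vector.

```python
def asymptotic_posteriors(priors, densities):
    """Posterior over K states given priors pi_k and density values f_k(p(s))."""
    weights = [pi * f for pi, f in zip(priors, densities)]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_via_ratios(k, priors, densities):
    """Same posterior for state k, written as 1 / (1 + sum of ratio terms)."""
    ratio_sum = sum(
        (priors[j] / priors[k]) * (densities[j] / densities[k])
        for j in range(len(priors))
        if j != k
    )
    return 1.0 / (1.0 + ratio_sum)


priors = [0.5, 0.3, 0.2]      # illustrative prior vector
densities = [2.0, 1.0, 1.0]   # illustrative density values at p(s)
print(asymptotic_posteriors(priors, densities))
print(posterior_via_ratios(0, priors, densities))
```

Because the density values are strictly positive and finite under Assumption 2, no posterior in this calculation reaches 1, which is the sense in which asymptotic learning fails.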


Our first theorem in this section parallels Theorem 3 and shows that under Assumption 2 there will be lack of asymptotic learning, and under a relatively weak additional condition, there will also be asymptotic disagreement.

Theorem 6 Suppose Assumption 2 holds for $i = 1,2$, then for each $k = 1, \dots, K$, and for each $i = 1,2$,

1. $\Pr^i\left(\phi^i_{k,\infty}(p(s)) \neq 1 \mid \theta = A^k\right) = 1$, and

2. $\Pr^i\left(\left|\phi^1_{k,\infty}(p(s)) - \phi^2_{k,\infty}(p(s))\right| \neq 0\right) = 1$ whenever $\Pr^i\left((T_k(\pi^1) - T_k(\pi^2))'\, T_k(f^i(p(s))) = 0\right) = 0$ and $F^1_\theta = F^2_\theta$ for each $\theta \in \Theta$.

The additional condition in part 2 of Theorem 6, that $\Pr^i\left((T_k(\pi^1) - T_k(\pi^2))'\, T_k(f^i(p(s))) = 0\right) = 0$, plays the role of differences in priors in Theorem 3 (here "$'$" denotes the transpose of the vector in question). In particular, if this condition did not hold, then at some $p(s)$, the relative asymptotic likelihood of some states could be the same according to two individuals with different priors and they would interpret at least some sequences of signals in a similar manner and achieve asymptotic agreement. It is important to note that the condition that $\Pr^i\left((T_k(\pi^1) - T_k(\pi^2))'\, T_k(f^i(p(s))) = 0\right) = 0$ is relatively weak and holds generically, i.e., if it did not hold, a small perturbation of $\pi^1$ or $\pi^2$ would restore it.^{13} Part 2 of Theorem 6 therefore implies that asymptotic disagreement occurs generically.

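The next theorem shows that small differences in priors can again widen after observing the same sequence of signals.

Theorem 7 Under Assumption 2, assume $\mathbf{1}'\left(T_k\left((f^1_\theta(p))_{\theta\in\Theta}\right) - T_k\left((f^2_\theta(p))_{\theta\in\Theta}\right)\right) \neq 0$ for each $p \in [0,1]$, each $k = 1, \dots, K$, where $\mathbf{1} \equiv (1, \dots, 1)'$. Then, there exists an open set of prior vectors $\pi^1$ and $\pi^2$, such that

\[
\left|\phi^1_{k,\infty}(p(s)) - \phi^2_{k,\infty}(p(s))\right| > \left|\pi^1_k - \pi^2_k\right| \quad \text{for each } k = 1, \dots, K \text{ and } s \in \bar S
\]

and

\[
\Pr^i\left(\left|\phi^1_{k,\infty}(p(s)) - \phi^2_{k,\infty}(p(s))\right| > \left|\pi^1_k - \pi^2_k\right|\right) = 1 \quad \text{for each } k = 1, \dots, K.
\]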


The condition $\mathbf{1}'\left(T_k\left((f^1_\theta(p))_{\theta\in\Theta}\right) - T_k\left((f^2_\theta(p))_{\theta\in\Theta}\right)\right) \neq 0$ is similar to the additional condition in part 2 of Theorem 6, and as with that condition, it is relatively weak and holds generically. Finally, the following theorem generalizes Theorem 5. The appropriate construction of the families of probability densities is also provided in the theorem.


^{13} More formally, the set of solutions $\mathcal{S} \equiv \{(\pi^1, \pi^2, p) \in \Delta(L)^3 : (T_k(\pi^1) - T_k(\pi^2))'\, T_k(f^i(p)) = 0\}$ has Lebesgue measure 0. This is a consequence of the Preimage Theorem and Sard's Theorem in differential topology (see, for example, Guillemin and Pollack, 1974, pp. 21 and 39). The Preimage Theorem implies that if $y$ is a regular value of a map $f : X \to Y$, then $f^{-1}(y)$ is a submanifold of $X$ with dimension equal to $\dim X - \dim Y$. In our context, this implies that if 0 is a regular value of the map $(T_k(\pi^1) - T_k(\pi^2))'\, T_k(f^i(p))$, then the set $\mathcal{S}$ is a two dimensional submanifold of $\Delta(L)^3$ and thus has Lebesgue measure 0. Sard's theorem
implies that 0 is generically a regular value.

Theorem 8 Suppose that Assumption 2 holds. For each $\theta \in \Theta$ and $m \in \mathbb{Z}_+$, define the subjective density $f^i_{\theta,m}$ by

\[
f^i_{\theta,m}(p) = c(i,\theta,m)\, f(m(p - p(i,\theta))) \tag{16}
\]

where $c(i,\theta,m) \equiv 1 / \int_{p \in \Delta(L)} f(m(p - p(i,\theta)))\,dp$, $p(i,\theta) \in \Delta(L)$ with $p(i,\theta) \neq p(i,\theta')$ whenever $\theta \neq \theta'$, and $f : \mathbb{R}^L \to \mathbb{R}$ is a positive, continuous probability density function that satisfies the following conditions:

(i) $\lim_{h\to\infty} \max_{\{x : \|x\| \ge h\}} f(x) = 0$,

(ii)
\[
R(x,y) \equiv \lim_{m\to\infty} \frac{f(mx)}{f(my)} \tag{17}
\]
exists at all $x, y$, and

(iii) convergence in (17) holds uniformly over a neighborhood of each $(p(i,\theta) - p(j,\theta'),\ p(i,\theta) - p(j,\theta))$.

Also let $\phi^i_{k,\infty,m}(p(s)) \equiv \lim_{n\to\infty} \phi^i_{k,n,m}(s)$ be the asymptotic posterior of individual $i$ with subjective density $f^i_{\theta,m}$. Then,

1. $\lim_{m\to\infty} \left(\phi^i_{k,\infty,m}(p(i,A^k)) - \phi^j_{k,\infty,m}(p(i,A^k))\right) = 0$ if and only if $R\left(p(i,A^k) - p(j,A^{k'}),\ p(i,A^k) - p(j,A^k)\right) = 0$ for each $k' \neq k$.

2. Suppose that $R\left(p(i,\theta) - p(j,\theta'),\ p(i,\theta) - p(j,\theta)\right) = 0$ for each distinct $\theta$ and $\theta'$. Then for every $\varepsilon > 0$ and $\delta > 0$, there exists $\bar m \in \mathbb{Z}_+$ such that

\[
\Pr^i\left(\left\|\phi^1_{\infty,m}(s) - \phi^2_{\infty,m}(s)\right\| > \varepsilon\right) < \delta \qquad (\forall m > \bar m,\ i = 1,2).
\]

3. Suppose that $R\left(p(i,\theta) - p(j,\theta'),\ p(i,\theta) - p(j,\theta)\right) \neq 0$ for each distinct $\theta$ and $\theta'$. Then there exists $\varepsilon > 0$ such that for each $\delta > 0$, there exists $\bar m \in \mathbb{Z}_+$ such that

\[
\Pr^i\left(\left\|\phi^1_{\infty,m}(s) - \phi^2_{\infty,m}(s)\right\| > \varepsilon\right) > 1 - \delta \qquad (\forall m > \bar m,\ i = 1,2).
\]

These theorems therefore show that the results about lack of asymptotic learning and asymptotic agreement derived in the previous section do not depend on the assumption that there are only two states and binary signals. It is also straightforward to generalize Proposition 1 and Corollary 1 to the case with multiple states and signals; we omit this to avoid repetition.
The  results  in  this  section  are  stated  for  the  case  in  which  both  the  number  of  signal  values 
and  states  are  finite.  They  can  also  be  generalized  to  the  case  of  a  continuum  of  signal  values 
and  states,  but  this  introduces  a  range  of  technical  issues  that  are  not  central  to  our  focus 
here. 

4     Applications 

In  this  section  we  discuss  a  number  of  applications  of  the  results  derived  so  far.  The  applications 
are  chosen  to  show  various  different  economic  consequences  from  learning  and  disagreement 
under  uncertainty.  Throughout,  we  strive  to  choose  the  simplest  examples.  The  first  example 
illustrates  how  learning  under  uncertainty  can  overturn  some  simple  insights  from  basic  game 
theory.  The  second  example  shows  how  such  learning  can  act  as  an  equilibrium  selection 
device  as  in  Carlsson  and  van  Damme  (1993).  The  third  example  is  the  most  substantial 
application  and  shows  how  learning  under  uncertainty  affects  speculative  asset  trading.  The 
fourth  example  illustrates  how  learning  under  uncertainty  can  affect  the  timing  of  agreement 
in bargaining. Finally, the last example shows how a special case of our model of learning under uncertainty can arise when there is information transmission by a potentially biased media outlet.^{14}

4.1     Value  of  Information  in  Common-Interest  Games 

Consider a common-interest game in which the players have identical payoff functions. Typically in common-interest games information is valuable in the sense that with more information about underlying parameters, the value of the game in the best equilibrium will be higher. We would therefore expect players to collect or at least wait for the arrival of additional information before playing such games. We now show that when there is learning under uncertainty, additional information can be harmful in common-interest games, and thus the agents may prefer to play the game before additional information arrives.

^{14} In this section, except for the example on equilibrium selection and the last example of the game of belief manipulation, we study complete-information games with possibly non-common priors. Formally, information and belief structure in these games can be described as follows. Fix the state space $\Omega = \Theta \times \bar S$, and for each $n < \infty$, consider the information partition $I^n = \{I^n(s) = \{(\theta, s') \mid s'_t = s_t\ \forall t \le n\} \mid s \in \bar S\}$ that is common for both players. For $n = \infty$, we introduce the common information partition $I^\infty = \{I^\infty(s) = \Theta \times \{s\} \mid s \in \bar S\}$. At each $I^n(s)$, player $i = 1,2$ assigns probability $\phi^i_n(s)$ to the state $\theta = A$ and probability $1 - \phi^i_n(s)$ to the state $\theta = B$. Since the players have a common partition at each $s$ and $n$, their beliefs are common knowledge. Notice that, under certainty, $\phi^1_\infty(s) = \phi^2_\infty(s) \in \{0,1\}$, so that after observing $s$, both players assign probability 1 to the same $\theta$. In that case, there will be common certainty of $\theta$, or loosely speaking, $\theta$ becomes "common knowledge." This is not necessarily the case under uncertainty.

To illustrate these issues, consider the payoff matrix

\[
\begin{array}{c|cc}
 & \alpha & \beta \\ \hline
\alpha & 2\theta,\ 2\theta & 1/2,\ 1/2 \\
\beta & 1/2,\ 1/2 & 1-\theta,\ 1-\theta
\end{array}
\]

where $\theta \in \{0,1\}$, and the agents have a common prior on $\theta$ according to which the probability of $\theta = 1$ is $\pi \in (1/2, 1)$. When there is no information, $\alpha$ strictly dominates $\beta$ (since the expected value of the payoff from $(\alpha,\alpha)$ is strictly greater than 1/2 and the expected value of the payoff from $(\beta,\beta)$ is strictly less than 1/2). In the dominant-strategy equilibrium, $(\alpha,\alpha)$, each player receives $2\theta$, which equals 2 with probability $\pi$, and thus achieves an expected payoff of $2\pi > 1$.

First, consider the implications of learning under certainty. Suppose that the agents are allowed to observe an infinite sequence of signals $s = \{s_t\}_{t=1}^{\infty}$, where each agent believes that $\Pr^i(s_t = \theta \mid \theta) = p^i > 1/2$. Theorem 1 then implies that after observing the sequence of signals, the agents will learn $\theta$. If the frequency $p(s)$ of signals with $s_t = 1$ is greater than 1/2, they will learn that $\theta = 1$; otherwise they will learn that $\theta = 0$. If $p(s) < 1/2$, $\beta$ strictly dominates $\alpha$, and hence $(\beta,\beta)$ is the dominant strategy equilibrium. If $p(s) > 1/2$, $\alpha$ strictly dominates $\beta$ and $(\alpha,\alpha)$ is the dominant strategy equilibrium. Consequently, when they learn under certainty before playing the game, the expected payoff to each player is $2\pi + (1 - \pi) > 2\pi$. This implies that, if they have the option, the players would prefer to wait for the arrival of public information before playing the game.

Let us next turn to learning under uncertainty. In particular, suppose that the agents do not know the signal distribution and their subjective densities are similar to those in Example 2:

\[
f^i_\theta(p) =
\begin{cases}
(1 - \varepsilon - \varepsilon^2)/\delta & \text{if } p^i - \delta/2 \le p \le p^i + \delta/2 \\
\varepsilon & \text{if } p < 1/2 \\
\varepsilon^2 & \text{otherwise}
\end{cases}
\tag{18}
\]

for each $\theta$, where $0 < \delta < p^1 - p^2$ and $\varepsilon$ and $\delta$ are taken to be arbitrarily small (i.e., we consider the limit where $\varepsilon \to 0$ and $\delta \to 0$, or loosely speaking, where $\varepsilon \cong 0$ and $\delta \cong 0$). Recall from Example 2 that when $\varepsilon \cong 0$ and $\delta \cong 0$, the asymptotic posterior probability of $\theta = 1$ is

\[
\phi^i_\infty(p(s)) =
\begin{cases}
1 & \text{if } p(s) < 1 - p^i - \delta/2, \\
  & \text{or } 1 - p^i + \delta/2 < p(s) < 1/2, \\
  & \text{or } p^i - \delta/2 \le p(s) \le p^i + \delta/2, \\
0 & \text{otherwise.}
\end{cases}
\]

As discussed above, when $\varepsilon \cong 0$ and $\delta \cong 0$, each agent believes that he will learn the true value of $\theta$, while the other agent will reach the opposite conclusion. This implies that both agents expect that one of them will have $\phi^i_\infty(p(s)) = 1$ while the other has $\phi^j_\infty(p(s)) = 0$. Consequently, the unique equilibrium will be $(\alpha, \beta)$, giving both agents an ex ante expected payoff of 1/2, which is strictly less than the expected payoff to playing the game before the arrival of information (which is $2\pi$). Therefore, when there is learning under uncertainty, both agents may prefer to play the game before the arrival of public information.

4.2      Selection  in  Coordination  Games 

The  initial  difference  in  players'  beliefs  about  the  signal  distribution  need  not  be  due  to  lack 
of  common  prior;  it  may  be  due  to  private  information.  Building  on  an  example  by  Carlsson 
and  van  Damme  (1993),  we  now  illustrate  that  when  the  players  are  uncertain  about  the  signal 
distribution,  small  differences  in  beliefs,  combined  with  learning,  may  have  a  significant  effect 
on  the  outcome  of  the  game  and  may  select  one  of  the  multiple  equilibria  of  the  game. 
Consider  a  game  with  the  payoff  matrix 

_I N 

I 

N 

where  9  ~  A/"  (0,1).  The  players  observe  an  infinite  sequence  of  public  signals  s  =  {st}^o> 
where  St  G  {0, 1}  and 

PTist  =  l\9)  =  l/{l  +  exp{-i9  +  rj))),  (19) 

with  77  ~  A/'(0, 1).  In  addition,  each  player  observes  a  private  signal 

Xi=r]  +  Ui 

where  Ui  is  uniformly  distributed  on  [— e/2,  e/2]  for  some  small  e  >  0. 

Let  us  define  k  =  \og{p{s))  —  log(l  —  p{s)).  Equation  (19)  implies  that  after  observing  s, 
the  players  infer  that  9  +  t]  =  k.  For  small  e,  conditional  on  Xi,  77  is  distributed  approximately 
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9,9 

9-1,0 

0,9-1 

0,0 

uniformly  on  [xi  —  e/2,Xi  +  e/2]  (see  Carlsson  and  van  Damme,  1993).  This  implies  that  con- 
ditional on  Xi  and  s,  9  is  approximately  uniformly  distributed  on  [k  —  Xi  —  e/2,  k  —  Xi  +  e/2]. 
Now note that with the reverse order on $x_i$, the game is supermodular. Therefore, there exist
extremal rationalizable strategy profiles, which also constitute monotone, symmetric Bayesian
Nash Equilibria. In each equilibrium, there is a cutoff value, $x^*$, such that the equilibrium
action is $I$ if $x_i < x^*$ and $N$ if $x_i > x^*$. This cutoff, $x^*$, is defined such that player $i$ is indifferent
between the two actions, i.e.,

$$\kappa - x^* = \Pr(x_j > x^* \mid x_i = x^*) = 1/2 + O(\epsilon),$$

where $O(\epsilon)$ is such that $\lim_{\epsilon \to 0} O(\epsilon) = 0$. This establishes that

$$x^* = \kappa - 1/2 - O(\epsilon).$$

Therefore, when $\epsilon$ is small, the game is dominance solvable, and each player $i$ plays $I$ if
$x_i < \kappa - 1/2$ and $N$ if $x_i > \kappa - 1/2$.

In  this  game,  learning  under  certainty  has  very  different  implications  from  those  above. 
Suppose instead that the players knew the conditional signal distribution (i.e., they knew $\eta$),
so that we are in a world of learning under certainty. Then after $s$ is observed, $\theta$ would become
common knowledge, and there would be multiple equilibria whenever $\theta \in (0,1)$. This example
therefore  illustrates  how  learning  under  uncertainty  can  lead  to  the  selection  of  one  of  the 
equilibria  in  a  coordination  game. 
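To make the selection argument concrete, the following is a minimal simulation sketch, not part of the original analysis. It approximates the infinite public sequence by a long finite one, recovers $\kappa$ from the empirical frequency of 1's as implied by (19), and applies the limiting cutoff $\kappa - 1/2$. The function name, the default values of `eps` and `n_signals`, and the seed are illustrative assumptions.

```python
import math
import random


def simulate_coordination(theta, eta, eps=0.01, n_signals=200_000, seed=0):
    """Simulate public signals from (19) and the resulting cutoff play.

    Assumptions (hypothetical, for illustration only): a long finite sample
    stands in for the infinite sequence, and both players use the
    small-epsilon cutoff kappa - 1/2.
    """
    rng = random.Random(seed)
    p_one = 1.0 / (1.0 + math.exp(-(theta + eta)))  # Pr(s_t = 1) from (19)
    ones = sum(rng.random() < p_one for _ in range(n_signals))
    rho = ones / n_signals  # empirical frequency of 1's
    kappa = math.log(rho) - math.log(1.0 - rho)  # estimates theta + eta
    cutoff = kappa - 0.5
    actions = []
    for _ in range(2):
        x_i = eta + rng.uniform(-eps / 2, eps / 2)  # private signal
        actions.append("I" if x_i < cutoff else "N")
    return kappa, actions
```

In this sketch a player chooses $I$ exactly when the inferred value $\kappa - x_i \approx \theta$ exceeds 1/2, so both players coordinate on the same action even for $\theta \in (0,1)$, where common knowledge of $\theta$ would leave two equilibria.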

4.3     A  Simple  Model  of  Asset  Trade 

One  of  the  most  interesting  applications  of  the  ideas  developed  here  is  to  models  of  asset 
trading. Models of asset trading with different priors have been studied by, among others,
Harrison and Kreps (1978) and Morris (1996). These works assume different priors about
the dividend process and allow for learning under certainty. They establish the possibility of
"speculative asset trading". We now investigate the implications of learning under uncertainty
for  the  pattern  of  speculative  asset  trading. 

Consider an asset that pays 1 if the state is $A$ and 0 if the state is $B$. Assume that Agent
2 owns the asset, but Agent 1 may wish to buy it. We have two dates, $\tau = 0$ and $\tau = 1$, and
the agents observe a sequence of signals between these dates. For simplicity, we again take this
to be an infinite sequence $s = \{s_t\}_{t=1}^{\infty}$. We also simplify this example by assuming that Agent
1 has all the bargaining power: at either date, if he wants to buy the asset, Agent 1 makes
a take-it-or-leave-it price offer $P_\tau$, and trade occurs at price $P_\tau$ if Agent 2 accepts the offer.
Assume also that $\pi^1 > \pi^2$, so that Agent 1 is more optimistic. This assumption ensures that
Agent 1 would like to purchase the asset. We are interested in subgame-perfect equilibrium of
this game.

Let us start with the case in which there is learning under certainty. Suppose that each
agent is certain that $p_A = p_B = p^i$ for some number $p^i > 1/2$. In that case, from Theorem
1, both agents recognize at $\tau = 0$ that at $\tau = 1$, for each $\rho(s)$, the value of the asset will be the
same for both of them: it will be worth 1 if $\rho(s) > 1/2$ and 0 if $\rho(s) < 1/2$. Hence, at $\tau = 1$
the agents will be indifferent between trading and not trading the asset (at price
$P_1 = \phi^1_\infty(\rho(s)) = \phi^2_\infty(\rho(s))$) at each history $\rho(s)$. Therefore, if trade does not occur at
$\tau = 0$, the continuation value of Agent 1 is 0, and the continuation value of Agent 2 is $\pi^2$. If they
trade at price $P_0$, then the continuation values of Agents 1 and 2 will be $\pi^1 - P_0$ and $P_0$,
respectively. This implies that at date 0, Agent 2 accepts an offer if and only if $P_0 \geq \pi^2$. Since
$\pi^1 > \pi^2$, Agent 1 is happy to offer the price $P_0 = \pi^2$ at date $\tau = 0$ and trade takes place.
Therefore, with learning under certainty, there will be immediate trade at $\tau = 0$.

We next turn to the case of learning under uncertainty and suppose that the agents do
not know $p_A$ and $p_B$. In contrast to the case of learning under certainty, the agents now have
an incentive to delay trading. To illustrate this, we first consider a simple example where
subjective densities are as in Example 1, with $\epsilon \to 0$. Now, at date 1, if
$\hat{p}^1 - \delta/2 < \rho(s) < \hat{p}^1 + \delta/2$, then the value of the asset for Agent 2 is
$\phi^2_\infty(\rho(s)) = \pi^2$, and the value of the asset for Agent 1 is approximately 1. Hence, at such
$\rho(s)$, Agent 1 buys the asset from Agent 2 at price $P_1(\rho(s)) = \pi^2$, enjoying gains from trade
equal to $1 - \pi^2$. Since the equilibrium payoff of Agent 1 must be non-negative in all other
contingencies, this shows that when they do not trade at date 0, his continuation value is at least

$$\pi^1 (1 - \pi^2)$$

(when $\epsilon \to 0$). The continuation value of Agent 2 must be at least $\pi^2$, as he has the option
of never selling his asset. Therefore, they can trade at date 0 only if the total payoff from
trading, which is $\pi^1$, exceeds the sum of these continuation values, $\pi^1 (1 - \pi^2) + \pi^2$. Since this
is impossible, there will be no trade at $\tau = 0$. Instead, Agent 1 will wait for the information
to buy the asset at date 1 (provided that $\rho(s)$ turns out to be in a range where he concludes
that the asset pays 1).
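The impossibility claimed above is a one-line calculation: the shortfall of the date-0 surplus relative to the two continuation values equals $\pi^2 (1 - \pi^1) > 0$. The sketch below simply checks this arithmetic for given priors; the function name and the sample values are hypothetical and serve only as an illustration of the $\epsilon \to 0$ example.

```python
def date0_trade_shortfall(pi1, pi2):
    """Return (sum of continuation values) minus (total payoff from trading).

    Illustrative check of the epsilon -> 0 example: trade at date 0 would
    require this shortfall to be non-positive.
    """
    if not (0.0 < pi2 < pi1 < 1.0):
        raise ValueError("expected 0 < pi2 < pi1 < 1")
    total_payoff_from_trade = pi1          # Agent 1's valuation at date 0
    continuation_agent1 = pi1 * (1.0 - pi2)  # lower bound from waiting
    continuation_agent2 = pi2                # option of never selling
    return continuation_agent1 + continuation_agent2 - total_payoff_from_trade


def trade_at_date0_possible(pi1, pi2):
    """True only if the date-0 surplus covers both continuation values."""
    return date0_trade_shortfall(pi1, pi2) <= 0.0
```

For any admissible pair of priors the shortfall is strictly positive, which is the sense in which immediate trade is impossible in this example.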

This  example  exploits  the  general  intuition  discussed  after  Theorem  4:  if  the  agents  are 
uncertain  about  the  informativeness  of  the  signals,  each  agent  may  expect  to  learn  more  from 
the  signals  than  the  other  agent.  In  fact,  this  example  has  the  extreme  feature  whereby  each 
agent  believes  that  he  will  definitely  learn  the  true  state,  but  the  other  agent  will  fail  to  do 
so.  This  induces  the  agents  to  wait  for  the  arrival  of  additional  information  before  trading. 
This  contrasts  with  the  intuition  that  observation  of  common  information  should  take  agents 
towards  common  beliefs  and  make  trades  less  likely.  This  intuition  is  correct  in  models  of 
learning  under  certainty  and  is  the  reason  why  previous  models  have  generated  speculative 
trade at the beginning (Harrison and Kreps, 1978, and Morris, 1996). Instead, here there is
delayed  speculative  trading. 

The  next  result  characterizes  the  conditions  for  delayed  asset  trading  more  generally: 

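Proposition 2 In any subgame-perfect equilibrium, trade is delayed to $\tau = 1$ if and only if

$$E^2[\phi^2_\infty] = \pi^2 > E^1\left[\min\{\phi^1_\infty, \phi^2_\infty\}\right].$$

That is, when $\pi^2 > E^1[\min\{\phi^1_\infty, \phi^2_\infty\}]$, Agent 1 does not buy at $\tau = 0$ and buys at $\tau = 1$ if
$\phi^1_\infty(\rho(s)) > \phi^2_\infty(\rho(s))$; when $\pi^2 < E^1[\min\{\phi^1_\infty, \phi^2_\infty\}]$, Agent 1 buys at $\tau = 0$.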

Proof. In any subgame-perfect equilibrium, Agent 2 is indifferent between trading and not,
and hence his valuation of the asset is $\Pr^2(\theta = A \mid \text{Information})$. Therefore, trade at $\tau = 0$ can
take place at the price $P_0 = \pi^2$, while trade at $\tau = 1$ will be at the price $P_1(\rho(s)) = \phi^2_\infty(\rho(s))$.
At date 1, Agent 1 buys the asset if and only if $\phi^1_\infty(\rho(s)) \geq \phi^2_\infty(\rho(s))$, yielding the payoff of
$\max\{\phi^1_\infty(\rho(s)) - \phi^2_\infty(\rho(s)), 0\}$. This implies that Agent 1 is willing to buy at $\tau = 0$ if and
only if

$$\begin{aligned} \pi^1 - \pi^2 &\geq E^1\left[\max\{\phi^1_\infty(\rho(s)) - \phi^2_\infty(\rho(s)), 0\}\right] \\ &= E^1\left[\phi^1_\infty(\rho(s)) - \min\{\phi^1_\infty(\rho(s)), \phi^2_\infty(\rho(s))\}\right] \\ &= \pi^1 - E^1\left[\min\{\phi^1_\infty(\rho(s)), \phi^2_\infty(\rho(s))\}\right], \end{aligned}$$

as claimed. ■
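The criterion in Proposition 2 is easy to evaluate once Agent 1's subjective distribution over limiting beliefs is discretized. The sketch below is a hedged illustration under that assumption: `outcomes` is a hypothetical list of triples (probability under Agent 1's measure, $\phi^1_\infty$, $\phi^2_\infty$), and the function name is ours rather than the paper's.

```python
def delay_is_optimal(outcomes, pi2):
    """Evaluate the delay condition pi2 > E^1[min{phi1, phi2}].

    outcomes: iterable of (prob_under_agent1, phi1, phi2) triples, a
    hypothetical discretization of Agent 1's beliefs about rho(s).
    Returns (delay, expected_min).
    """
    outcomes = list(outcomes)
    total_prob = sum(q for q, _, _ in outcomes)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    expected_min = sum(q * min(phi1, phi2) for q, phi1, phi2 in outcomes)
    return pi2 > expected_min, expected_min


# Limit of the example above with pi1 = 0.7 and pi2 = 0.4: Agent 1 expects to
# learn the state while Agent 2 keeps her prior.
uncertainty_case = [(0.7, 1.0, 0.4), (0.3, 0.0, 0.4)]
# Learning under certainty: both agents learn the state.
certainty_case = [(0.7, 1.0, 1.0), (0.3, 0.0, 0.0)]
```

With these illustrative numbers, `delay_is_optimal(uncertainty_case, 0.4)` reports delay, while `delay_is_optimal(certainty_case, 0.4)` reports immediate trade, in line with the two cases discussed in the text.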

Since $\pi^1 = E^1[\phi^1_\infty] \geq E^1[\min\{\phi^1_\infty, \phi^2_\infty\}]$, this result provides a cutoff value for the initial
difference in beliefs, $\pi^1 - \pi^2$, in terms of the differences in the agents' interpretation of the
signals. The cutoff value is $E^1[\max\{\phi^1_\infty(\rho(s)) - \phi^2_\infty(\rho(s)), 0\}]$. If the initial difference is
lower than this value, then agents will wait until $\tau = 1$ to trade; otherwise, they will trade
immediately. Consistent with the above example, delay in trading becomes more likely when
the agents interpret the signals more differently, which is evident from the expression for the
cutoff value. This reasoning also suggests that if $F^1_\theta = F^2_\theta$ for each $\theta$ (so that the agents
interpret the signals in a similar fashion), then trade should occur immediately. The next
lemma shows that each agent believes that additional information will bring the other agent's
expectations closer to his own and will be used to prove that $F^1_\theta = F^2_\theta$ indeed implies immediate
trading.

Lemma 4 If $\pi^1 > \pi^2$ and $F^1_\theta = F^2_\theta$ for each $\theta$, then

$$E^1[\phi^2_\infty] \geq \pi^2.$$

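Proof. Recall that the ex ante expectation of individual $i$ regarding $\phi^2_\infty$ can be written as

$$\begin{aligned} E^i[\phi^2_\infty] &= \int_0^1 \left[\pi^i f^i_A(\rho)\,\phi^2_\infty(\rho) + (1 - \pi^i) f^i_B(1 - \rho)\,\phi^2_\infty(\rho)\right] d\rho \\ &= \int_0^1 \frac{\pi^i f_A(\rho) + (1 - \pi^i) f_B(1 - \rho)}{\pi^2 f_A(\rho) + (1 - \pi^2) f_B(1 - \rho)}\, \pi^2 f_A(\rho)\, d\rho, \end{aligned} \tag{20}$$

where the first line uses the definition of ex ante expectation under the probability measure
$\Pr^i$, while the second line exploits equations (3) and (4) and the fact that since $F^1_\theta = F^2_\theta$,
$f^1_\theta(\rho) = f^2_\theta(\rho) = f_\theta(\rho)$ for all $\rho$. Now define

$$I(\pi) = \int_0^1 \frac{\pi f_A(\rho) + (1 - \pi) f_B(1 - \rho)}{\pi^2 f_A(\rho) + (1 - \pi^2) f_B(1 - \rho)}\, \pi^2 f_A(\rho)\, d\rho.$$

From (20), $E^1[\phi^2_\infty] = I(\pi^1)$ and $\pi^2 = E^2[\phi^2_\infty] = I(\pi^2)$. Hence, it suffices to show that $I$ is
increasing in $\pi$. Now,

$$I'(\pi) = \int_0^1 \frac{\pi^2 f_A(\rho)}{\pi^2 f_A(\rho) + (1 - \pi^2) f_B(1 - \rho)} \left(f_A(\rho) - f_B(1 - \rho)\right) d\rho.$$

Moreover, $f_A(\rho) / \left[\pi^2 f_A(\rho) + (1 - \pi^2) f_B(1 - \rho)\right] \geq 1$ if and only if $f_A(\rho) \geq f_B(1 - \rho)$.
Hence,

$$\begin{aligned} I'(\pi) &= \int_{f_A > f_B} \frac{\pi^2 f_A(\rho)}{\pi^2 f_A(\rho) + (1 - \pi^2) f_B(1 - \rho)} \left(f_A(\rho) - f_B(1 - \rho)\right) d\rho \\ &\quad - \int_{f_A < f_B} \frac{\pi^2 f_A(\rho)}{\pi^2 f_A(\rho) + (1 - \pi^2) f_B(1 - \rho)} \left(f_B(1 - \rho) - f_A(\rho)\right) d\rho \\ &\geq \pi^2 \left[\int_{f_A > f_B} \left(f_A(\rho) - f_B(1 - \rho)\right) d\rho - \int_{f_A < f_B} \left(f_B(1 - \rho) - f_A(\rho)\right) d\rho\right] \\ &= \pi^2 \int_0^1 \left(f_A(\rho) - f_B(1 - \rho)\right) d\rho = 0. \end{aligned}$$

■

Footnote (to the remark that agents with $F^1_\theta = F^2_\theta$ interpret the signals in a similar fashion): Recall from Theorem 3 that even when $F^1_\theta = F^2_\theta$, agents interpret signals differently because $\pi^1 \neq \pi^2$.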
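Lemma 4 can be checked numerically for particular densities. The sketch below is an illustration only: it assumes hypothetical densities $f_A(p) = f_B(p) = 12 p^2 (1 - p)$ on $[0,1]$ (a Beta(3,2) shape, not taken from the paper) and evaluates the integral $I(\pi)$ from the proof with a midpoint rule. The names `f_A`, `f_B`, and `I_of` are ours.

```python
def f_A(p):
    """Hypothetical subjective density of p_A on [0, 1] (Beta(3, 2) shape)."""
    return 12.0 * p * p * (1.0 - p)


def f_B(p):
    """Hypothetical subjective density of p_B on [0, 1] (same shape)."""
    return 12.0 * p * p * (1.0 - p)


def I_of(pi, pi2, n=20_000):
    """Midpoint-rule approximation of I(pi) from the proof of Lemma 4.

    pi  : prior of the agent forming the expectation
    pi2 : prior of Agent 2, whose limiting belief is being forecast
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        rho = (k + 0.5) * h
        a = f_A(rho)
        b = f_B(1.0 - rho)
        denom = pi2 * a + (1.0 - pi2) * b
        if denom > 0.0:
            total += (pi * a + (1.0 - pi) * b) / denom * pi2 * a * h
    return total
```

Under these assumed densities, `I_of(pi2, pi2)` reproduces Agent 2's prior, and `I_of(pi1, pi2)` with `pi1 > pi2` lies above it, which is the inequality $E^1[\phi^2_\infty] \geq \pi^2$ of the lemma.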

Together with the previous proposition, this lemma yields the following result establishing
that delay in asset trading can only occur when subjective probability distributions differ across
individuals.

Proposition 3 If $F^1_\theta = F^2_\theta$ for each $\theta$, then in any subgame-perfect equilibrium, trade occurs
at $\tau = 0$.

Proof. Since $\pi^1 > \pi^2$ and $F^1_\theta = F^2_\theta$, Lemma 1 implies that $\phi^1_\infty(\rho(s)) \geq \phi^2_\infty(\rho(s))$ for
each $\rho(s)$. Then, $E^1[\min\{\phi^1_\infty, \phi^2_\infty\}] = E^1[\phi^2_\infty] \geq \pi^2$, where the last inequality is by Lemma
4. Therefore, by Proposition 2, Agent 1 buys at $\tau = 0$. ■

This  proposition  establishes  that  when  the  two  agents  have  the  same  subjective  probability 
distributions,  there  will  be  no  delay  in  trading.  However,  as  the  example  above  illustrates, 
when $F^1_\theta \neq F^2_\theta$, delayed speculative trading is possible. The intuition is given by Lemma 4:
when  agents  have  the  same  subjective  probability  distribution  but  different  priors,  each  will 
believe  that  additional  information  will  bring  the  other  agent's  beliefs  closer  to  his  own.  This 
leads  to  early  trading.  However,  when  the  agents  differ  in  terms  of  their  subjective  probability 
distributions,  they  expect  to  learn  more  from  new  information  (because,  as  discussed  after 
Theorem  4  above,  they  believe  that  they  have  the  "correct  model  of  the  world").  Consequently, 
they  delay  trading. 

Learning under uncertainty does not necessarily lead to additional delay in economic
transactions, however. Whether it does so or not depends on the effect of the extent of disagreement
on  the  timing  of  economic  transactions.  We  will  next  see  that,  in  the  context  of  bargaining, 
the  presence  of  learning  under  uncertainty  may  be  a  force  towards  immediate  agreement  rather 
than  delay. 
4.4     Bargaining With Outside Options

Consider two agents bargaining over the division of a dollar. There are two dates, $\tau \in \{0, 1\}$,
and Agent 2 has an outside option $\theta \in \{\theta_L, \theta_H\}$ that expires at the end of date 1, where
$\theta_L < \theta_H < 1$ and the value of $\theta$ is initially unknown. Between the two dates, the agents
observe an infinite sequence of public signals $s = \{s_t\}_{t=1}^{\infty}$ with $s_t \in \{a_L, a_H\}$, where the signal
$a_L$ can be thought to be more likely under $\theta_L$.

Bargaining follows a simple protocol: at each date $\tau$, Agent 1 offers a share $w_\tau$ to Agent
2. If Agent 2 accepts the offer, the game ends. Agent 2 receives the proposal, $w_\tau$, and Agent
1 receives the remaining $1 - w_\tau$. If Agent 2 rejects the offer, she decides whether to take her
outside option, terminating the game, or wait for the next stage of the game. We assume that
delay is costly, so that if negotiations continue until date $\tau = 1$, Agent 1 incurs a cost $c > 0$.

Finally, as in Yildiz (2003), the agents are assumed to be "optimistic," in the sense that

$$y = E^2[\theta] - E^1[\theta] > 0.$$

In other words, they differ in their expectations of the outside option $\theta$ of Agent 2, with
Agent 2 believing that her outside option is higher than Agent 1's assessment of it, and $y$
parameterizes the extent of optimism in this game.

We  assume  that  the  game  form  and  beliefs  are  common  knowledge  and  look  for  the  subgame- 
perfect  equilibrium  of  this  simple  bargaining  game. 

By backward induction, at date $\tau = 1$, for any $\rho(s)$, the value of the outside option for Agent
2 is $E^2[\theta \mid \rho(s)] < 1$, and hence she accepts an offer $w_1$ if and only if $w_1 \geq E^2[\theta \mid \rho(s)]$. Agent 1
therefore offers $w_1 = E^2[\theta \mid \rho(s)]$. If there is no agreement at date 0, the continuation values of
the two agents are:

$$V^1 = 1 - c - E^1\left[E^2[\theta \mid \rho(s)]\right] \quad \text{and} \quad V^2 = E^2\left[E^2[\theta \mid \rho(s)]\right] = E^2[\theta],$$

which uses the fact that there is no cost of delay for Agent 2. Since they have 1 dollar in total,
the agents will delay the agreement to date $\tau = 1$ if and only if $V^1 + V^2 > 1$, that is, if and only if

$$E^2[\theta] - E^1\left[E^2[\theta \mid \rho(s)]\right] > c.$$

Here, $E^1[E^2[\theta \mid \rho(s)]]$ is Agent 1's expectation about how Agent 2 will update her beliefs
after observing the signals $s$. If Agent 1 expects that the information will reduce Agent 2's
expectation of her outside option more than the cost of waiting, then Agent 1 is willing to wait.
This description makes it clear that whether there will be agreement at date $\tau = 0$ depends
on Agent 1's assessment of how Agent 2 will interpret the (public) signals.

When each agent is certain about the informativeness of the signals, they agree ex ante
that they will interpret the information correctly. Consequently, as in Lemma 4 in the previous
subsection, Agent 1's Bayesian updating will indicate that the public information will reveal
him to be right. Yildiz (2004) has shown that this reasoning gives Agent 1 an incentive to
"wait to persuade" Agent 2 that her outside option is relatively low. More specifically, assume
that each agent $i$ is certain that $\Pr^i(s_t = \theta \mid \theta) = p^i > 1/2$ for some $p^1$ and $p^2$, where $p^1$ and $p^2$
may differ. Then, from Theorem 1, the agents agree that Agent 2 will learn her outside option,
i.e., $\Pr^i\left(E^2[\theta \mid \rho(s)] = \theta\right) = 1$ for each $i$. Hence, $E^1[E^2[\theta \mid \rho(s)]] = E^1[\theta]$. Therefore, Agent 1
delays the agreement to date $\tau = 1$ if and only if

$$y > c,$$

i.e., if and only if the level of optimism is higher than the cost of waiting. This discussion
therefore indicates that the arrival of public information can create a reason for delay in
bargaining games.

We  now  show  that  when  agents  are  uncertain  about  the  informativeness  of  the  signals,  this 
motive  for  delay  is  reduced  and  there  can  be  immediate  agreement.  Intuitively,  each  agent 
understands  that  the  same  signals  will  be  interpreted  differently  by  the  other  agent  and  thus 
expects  that  they  are  less  likely  to  persuade  the  other  agent.  This  decreases  the  incentives  to 
delay  agreement. 

This result is illustrated starkly here, with an example where a small amount of
uncertainty about the informativeness of signals removes all incentives to delay agreement. Suppose
that the agents' beliefs are again as in Example 1 with $\epsilon$ small. Now Agent 1 assigns
probability more than $1 - \epsilon$ to the event that $\rho(s)$ will be either in $[\hat{p}^1 - \delta/2, \hat{p}^1 + \delta/2]$ or in
$[1 - \hat{p}^1 - \delta/2, 1 - \hat{p}^1 + \delta/2]$, inducing Agent 2 to stick to her prior. Hence, Agent 1 expects
that Agent 2 will not update her prior by much. In particular, we have

$$E^1\left[E^2[\theta \mid \rho(s)]\right] = E^2[\theta] + O(\epsilon).$$

Thus

$$E^2[\theta] - E^1\left[E^2[\theta \mid \rho(s)]\right] = -O(\epsilon) < c.$$

This implies that agents will agree at the beginning of the game. Therefore, the same forces that
led to delayed asset trading in the previous subsection can also induce immediate agreement
in bargaining when agents are "optimistic".
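The two bargaining cases differ only in what Agent 1 expects Agent 2's posterior mean to be. The sketch below contrasts them under illustrative assumptions: the option values, priors, and cost are hypothetical numbers, the uncertainty case uses the $\epsilon \to 0$ limit in which Agent 2 keeps her prior, and the function names are ours.

```python
def expected_theta(prob_high, theta_low, theta_high):
    """Prior mean of the outside option for an agent with Pr(theta_H) = prob_high."""
    return prob_high * theta_high + (1.0 - prob_high) * theta_low


def agent1_delays(e2_theta, e1_of_e2_posterior, cost):
    """Delay condition from the text: E^2[theta] - E^1[E^2[theta | rho(s)]] > c."""
    return e2_theta - e1_of_e2_posterior > cost


# Hypothetical primitives (not from the paper).
THETA_L, THETA_H, COST = 0.2, 0.6, 0.05
E1_THETA = expected_theta(0.3, THETA_L, THETA_H)  # Agent 1's prior mean
E2_THETA = expected_theta(0.7, THETA_L, THETA_H)  # Agent 2's prior mean

# Certainty: Agent 1 expects Agent 2 to learn theta, so E^1[E^2[.]] = E^1[theta].
delay_under_certainty = agent1_delays(E2_THETA, E1_THETA, COST)
# Uncertainty (epsilon -> 0 limit): Agent 2 keeps her prior, so E^1[E^2[.]] = E^2[theta].
delay_under_uncertainty = agent1_delays(E2_THETA, E2_THETA, COST)
```

With these numbers optimism is $y = 0.16 > c$, so Agent 1 waits under certainty, whereas under uncertainty the expected gain from waiting vanishes and agreement is immediate.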

4.5     Manipulation  and  Uncertainty 

Our  final  example  is  intended  to  show  how  the  pattern  of  uncertainty  used  in  the  body  of  the 
paper  can  result  from  game  theoretic  interactions  between  an  agent  and  an  informed  party,  for 
example  as  in  cheap  talk  games  (Crawford  and  Sobel,  1982).  Since  our  purpose  is  to  illustrate 
this  possibility,  we  choose  the  simplest  environment  to  communicate  these  ideas  and  limit  the 
discussion  to  the  single  agent  setting — the  generalization  to  the  case  with  two  or  more  agents 
is  straightforward. 

The environment is as follows. The state of the world is $\theta \in \{0, 1\}$, and the agent starts
with a prior belief $\pi \in (0, 1)$ that $\theta = 1$ at $t = 0$. At time $t = 1$, this agent has to make a
decision $x \in [0, 1]$, and his payoff is $-(x - \theta)^2$. Thus the agent would like to form as accurate
an expectation about $\theta$ as possible.

The other player is a media outlet, $M$, which observes a large (infinite) number of signals
$s' = \{s'_t\}_{t=1}^{\infty}$ with $s'_t \in \{0, 1\}$, and makes a sequence of reports to the agent $s = \{s_t\}_{t=1}^{\infty}$
with $s_t \in \{0, 1\}$. The reports $s$ can be thought of as contents of newspaper articles, while $s'$
corresponds to the information that the newspaper collects before writing the articles. Since $s'$
is an exchangeable sequence, we can represent it, as before, with the fraction of signals that
are 1's, denoted by $\rho' \in [0, 1]$, and similarly $s$ is represented by $\rho \in [0, 1]$. This is convenient
as it allows us to model the mixed strategy of the media as a mapping

$$\sigma_M : [0, 1] \to \Delta([0, 1]),$$

where $\Delta([0, 1])$ is the set of probability distributions on $[0, 1]$. Let $i$ be the strategy that
puts probability 1 on the identity mapping, thus corresponding to $M$ reporting truthfully.
Otherwise, i.e., if $\sigma_M \neq i$, there is manipulation (or misreporting) on the part of the media
outlet $M$.^{15}

We also assume for simplicity that $\rho'$ has a continuous distribution with density $g_1$ when
$\theta = 1$ and $g_0$ when $\theta = 0$, such that $g_1(\rho) = 0$ for all $\rho < \bar{\rho}$ and $g_1(\rho) > 0$ for all $\rho > \bar{\rho}$, while
$g_0(\rho) > 0$ for all $\rho < \bar{\rho}$ and $g_0(\rho) = 0$ for all $\rho > \bar{\rho}$. This assumption implies that if $M$ reports
truthfully, i.e., $\sigma_M = i$, then Theorem 2 applies and there will be asymptotic learning (and
also asymptotic agreement when there is more than one agent).

^{15} See Baron (2004) and Gentzkow and Shapiro (2006) for related models of media bias.

Now suppose instead that there are three different types of player $M$ (unobservable to the
agent). With probability $\lambda_H \in (0, 1)$, the media is honest and can only play $\sigma^H_M = i$ (where
the superscript $H$ stands for the honest type). With probability $\lambda_\alpha \in (0, 1 - \lambda_H)$, the media outlet is
of type $\alpha$ and is biased towards 1. Type $\alpha$ media outlet receives utility equal to $x$ irrespective
of $\rho'$, and hence would like to manipulate the agent to choose high values of $x$. With the
complementary probability $\lambda_\beta = 1 - \lambda_\alpha - \lambda_H$, the media outlet is of type $\beta$ and is biased
towards 0, and receives utility equal to $1 - x$.

Let us now look for the perfect Bayesian equilibrium of the game between the media outlet
and the agent. The perfect Bayesian equilibrium can be represented by two reporting functions
$\sigma^\alpha_M : [0, 1] \to \Delta([0, 1])$ and $\sigma^\beta_M : [0, 1] \to \Delta([0, 1])$ for the two biased types of $M$, an updating
function $\phi : [0, 1] \to [0, 1]$, which determines the belief of the agent that $\theta = 1$ when the
sequence of reports is $\rho$, and an action function $x : [0, 1] \to [0, 1]$, which determines the
choice of the agent as a function of $\rho$ (there is no loss of generality here in restricting to pure
strategies).

In equilibrium, $x$ must be optimal for the agent given $\phi$; $\phi$ must be derived from Bayes'
rule given $\sigma^\alpha_M$, $\sigma^\beta_M$ and the prior $\pi$; and $\sigma^\alpha_M$ and $\sigma^\beta_M$ must be optimal for the two biased media
outlets given $x$.

Note first that since the payoff to the biased media outlet does not depend on the true
$\rho'$, without loss of generality, we can restrict $\sigma^\alpha_M$ and $\sigma^\beta_M$ not to depend on $\rho'$. Then, with a
slight abuse of notation, let $\sigma^\alpha_M(\rho)$ and $\sigma^\beta_M(\rho)$ be the respective densities with which these
two types report $\rho$.

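Second, the optimal choice of the agent after observing a sequence of signals with fraction
$\rho$ being equal to 1 is

$$x(\rho) = \phi(\rho),$$

for all $\rho \in [0, 1]$, i.e., the agent will choose an action equal to his belief $\phi(\rho)$.
Third, an application of Bayes' rule implies the following belief for the agent:

$$\phi(\rho) = \begin{cases} \dfrac{\pi\left[\lambda_\alpha \sigma^\alpha_M(\rho) + \lambda_\beta \sigma^\beta_M(\rho)\right]}{(1 - \pi)\lambda_H g_0(\rho) + \lambda_\alpha \sigma^\alpha_M(\rho) + \lambda_\beta \sigma^\beta_M(\rho)} & \text{if } \rho < \bar{\rho} \\[2ex] \dfrac{\pi\left[\lambda_H g_1(\rho) + \lambda_\alpha \sigma^\alpha_M(\rho) + \lambda_\beta \sigma^\beta_M(\rho)\right]}{\pi\lambda_H g_1(\rho) + \lambda_\alpha \sigma^\alpha_M(\rho) + \lambda_\beta \sigma^\beta_M(\rho)} & \text{if } \rho > \bar{\rho}. \end{cases} \tag{21}$$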
The following lemma shows that any (perfect Bayesian) equilibrium has a very simple form:

Lemma 5 In any equilibrium, there exist $\phi_A \geq \pi$ and $\phi_B \leq \pi$ such that $\phi(\rho) = \phi_B$ for all
$\rho < \bar{\rho}$ and $\phi(\rho) = \phi_A$ for all $\rho > \bar{\rho}$.

Proof. From (21), $\phi(\rho) \leq \pi$ when $\rho < \bar{\rho}$, and $\phi(\rho) \geq \pi$ when $\rho > \bar{\rho}$. Since the media type
$\alpha$ maximizes $x(\rho) = \phi(\rho)$, we have $\sigma^\alpha_M(\rho) = 0$ for $\rho < \bar{\rho}$. Now suppose that the lemma is false
and there exist $\rho_1, \rho_2 < \bar{\rho}$ such that $\phi(\rho_1) > \phi(\rho_2)$. Then we also have $\sigma^\beta_M(\rho_1) = 0$, since
media type $\beta$ minimizes $x(\rho) = \phi(\rho)$. But in that case, equation (21) implies that $\phi(\rho_1) = 0$,
contradicting the hypothesis. Therefore, $\phi(\rho)$ is constant over $\rho \in [0, \bar{\rho})$. The proof for $\phi(\rho)$
being constant over $\rho \in (\bar{\rho}, 1]$ is analogous. ■

It  follows  immediately  from  this  lemma  that  equilibrium  beliefs  will  take  the  form  given  in 
the  next  proposition: 

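Proposition 4 Suppose that $\rho \neq \bar{\rho}$. Then the unique equilibrium actions and beliefs are:

$$\sigma^\alpha_M(\rho) = g_1(\rho) \tag{22}$$

$$\sigma^\beta_M(\rho) = g_0(\rho) \tag{23}$$

$$x(\rho) = \phi(\rho) = \begin{cases} \dfrac{\pi\lambda_\beta}{(1 - \pi)\lambda_H + \lambda_\beta} & \text{if } \rho < \bar{\rho} \\[2ex] \dfrac{\pi(\lambda_H + \lambda_\alpha)}{\pi\lambda_H + \lambda_\alpha} & \text{if } \rho > \bar{\rho}. \end{cases} \tag{24}$$

Proof. Consider the case $\rho < \bar{\rho}$. As in the proof of Lemma 5, $\sigma^\alpha_M(\rho) = 0$. Since $\phi(\rho)$ is
constant over $\rho \in [0, \bar{\rho})$ (by Lemma 5), equation (21) implies that $\sigma^\beta_M$ is proportional to $g_0$ on
this range. Since this range is the common support of the densities $\sigma^\beta_M$ and $g_0$, it must be that
$\sigma^\beta_M = g_0$. Similarly, $\sigma^\alpha_M = g_1$. Substituting these equalities in (21), we obtain (24). ■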

This  proposition  implies  that  the  unique  equilibrium  of  the  game  between  the  media  outlet 
and the agent leads to a special case of our model of learning under uncertainty. In particular,
the beliefs in (24) can be obtained by the appropriate choice of the functions $f_A(\cdot)$ and $f_B(\cdot)$
from  equation  (3)  in  Section  2.  This  illustrates  that  the  type  of  learning  under  uncertainty 
analyzed  in  this  paper  is  likely  to  emerge  in  game-theoretic  situations  where  one  of  the  players 
is  trying  to  manipulate  the  beliefs  of  others. 
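As a rough illustration of how coarse the agent's learning becomes, the sketch below computes the two equilibrium belief levels implied by Bayes' rule when the biased types mimic the honest report densities, that is, when $\sigma^\alpha_M = g_1$ and $\sigma^\beta_M = g_0$. The closed forms are our own substitution into Bayes' rule under that assumption, and the function and argument names are hypothetical.

```python
def equilibrium_beliefs(pi, lam_H, lam_alpha):
    """Return (phi_low, phi_high): beliefs after reports below / above the threshold.

    Assumes type alpha reports with the density of honest reports in state 1
    and type beta with the density of honest reports in state 0, so that the
    report densities cancel out of Bayes' rule.
    """
    lam_beta = 1.0 - lam_H - lam_alpha
    if not (0.0 < pi < 1.0 and lam_H > 0.0 and lam_alpha > 0.0 and lam_beta > 0.0):
        raise ValueError("need pi in (0, 1) and strictly positive type probabilities")
    # Low reports come from honest media in state 0 or from type beta in either state.
    phi_low = pi * lam_beta / ((1.0 - pi) * lam_H + lam_beta)
    # High reports come from honest media in state 1 or from type alpha in either state.
    phi_high = pi * (lam_H + lam_alpha) / (pi * lam_H + lam_alpha)
    return phi_low, phi_high
```

The agent's belief takes only these two values, strictly between 0 and 1, so even an infinite sequence of reports leaves him short of full learning whenever biased types have positive probability.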




5      Concluding  Remarks 

A key assumption of most theoretical analyses is that individuals have a "common prior,"
meaning that they have beliefs consistent with each other regarding the game forms, institutions,
and possible distributions of payoff-relevant parameters. This presumption is often justified by
the argument that sufficient common experiences and observations, either through individual
observations or transmission of information from others, will eliminate disagreements, taking
agents towards common priors. This presumption receives support from a number of well-known
theorems in statistics and economics, for example, Savage (1954) and Blackwell and
Dubins (1962).

Nevertheless, existing theorems apply to environments in which learning occurs under
certainty, that is, individuals are certain about the meaning of different signals. In many
situations, individuals are not only learning about a payoff-relevant parameter but also about the
interpretation of different signals. This takes us to the realm of environments where learning
takes  place  under  uncertainty.  For  example,  many  signals  favoring  a  particular  interpretation 
might  make  individuals  suspicious  that  the  signals  come  from  a  biased  source.  We  show  that 
learning  in  environments  with  uncertainty  may  lead  to  a  situation  in  which  there  is  lack  of 
full  identification  (in  the  standard  sense  of  the  term  in  econometrics  and  statistics).  In  such 
situations,  information  will  be  useful  to  individuals  but  may  not  lead  to  full  learning. 

This  paper  investigates  the  conditions  under  which  learning  under  uncertainty  will  take 
individuals  towards  common  priors  (or  asymptotic  agreement).  We  consider  an  environment 
in  which  two  individuals  with  different  priors  observe  the  same  infinite  sequence  of  signals 
informative about some underlying parameter. Our environment is one of learning under
uncertainty, since individuals have non-degenerate subjective probability distributions over the
likelihood of different signals given the values of the parameter. We show that when subjective
probability distributions of both individuals have full support (or in fact under weaker
assumptions), they will never agree, even after observing the same infinite sequence of signals. We
also show that this corresponds to a result of "agreement to eventually disagree"; individuals
will agree, before observing the sequence of signals, that their posteriors about the underlying
parameter will not converge. This common understanding that more information may not lead
to similar beliefs for agents has important implications for a variety of games and economic
models. Instead, when there is no full support in subjective probability distributions, asymptotic
learning and agreement may obtain.

An  important  implication  of  this  analysis  is  that  after  observing  the  same  sequence  of 
signals,  two  Bayesian  individuals  may  end  up  disagreeing  more  than  they  originally  did.  This 
result  contrasts  with  the  common  presumption  that  shared  information  and  experiences  will 
take individuals' assessments closer to each other.

We also systematically investigate whether asymptotic agreement obtains as the amount of
uncertainty in the environment diminishes (i.e., as we look at families of subjective probability
distributions converging to degenerate limit distributions with all their mass at one point).
We provide a complete characterization of the conditions under which this will be the case.
Asymptotic disagreement may prevail even under "approximate certainty," as long as the family
of subjective probability distributions converging to a degenerate distribution (and thus to an
environment with certainty) has regularly-varying tails (such as for the Pareto, the log-normal
or the t-distributions). In contrast, with rapidly-varying tails (such as the normal and the
exponential distributions), convergence to certainty leads to asymptotic agreement.

Lack of common beliefs and common priors has important implications for economic
behavior in a range of circumstances. We illustrate how the type of learning outlined in this paper
interacts with economic behavior in various different situations, including games of
coordination, games of common interest, bargaining, asset trading and games of communication. For
example, we show that contrary to standard results, individuals may wish to play
common-interest games before rather than after receiving more information about payoffs. Similarly, we
show how the possibility of observing the same sequence of signals may lead to "speculative
delay" in asset trading among individuals that start with similar beliefs. We also provide a simple
example illustrating why individuals may be uncertain about the informativeness of signals: the
strategic behavior of other agents trying to manipulate their beliefs.

The  issues  raised  here  have  important  implications  for  statistics  and  econometrics  as  well 
as  learning  in  game-theoretic  situations.  As  noted  above,  the  environment  considered  here 
corresponds  to  one  in  which  there  is  lack  of  full  identification.  Nevertheless,  Bayesian  posteriors 
are  well-behaved  and  converge  to  a  limiting  distribution.  Studying  the  limiting  properties  of 
these  posteriors  more  generally  and  how  they  may  be  used  for  inference  in  under-identified 
econometric  models  is  an  interesting  area  for  research. 
6 Appendix: Omitted Proofs

Proof  of  Theorem  1.     Under  the  hypothesis  of  the  theorem  and  with  the  notation  in  (2),  we  have 


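$$\frac{\Pr^i(r_n \mid \theta = B)}{\Pr^i(r_n \mid \theta = A)} = \left(\frac{\hat{p}^i}{1-\hat{p}^i}\right)^{n\,(1 - 2 r_n/n)},$$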


which converges to 0 or $\infty$ depending on whether $\lim_{n\to\infty} r_n(s)/n$ is greater than 1/2 or less than 1/2. If
$\lim_{n\to\infty} r_n(s)/n > 1/2$, then by (2), $\lim_{n\to\infty} \phi^1_n(s) = \lim_{n\to\infty} \phi^2_n(s) = 1$, and if $\lim_{n\to\infty} r_n(s)/n < 1/2$,
then $\lim_{n\to\infty} \phi^1_n(s) = \lim_{n\to\infty} \phi^2_n(s) = 0$. Since $\lim_{n\to\infty} r_n(s)/n = 1/2$ occurs with probability zero,
this shows the second part. The first part follows from the fact that, according to each $i$, conditional
on $\theta = A$, $\lim_{n\to\infty} r_n(s)/n = \hat{p}^i > 1/2$. ■
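
The convergence argument can be illustrated numerically. The sketch below assumes the two-state setting of Theorem 1, in which individual i is certain that each signal matches the state with probability p_hat > 1/2, so that the likelihood ratio of B to A after r signals "a" out of n draws is ((1 - p_hat)/p_hat)^(2r - n). The function name and all parameter values are hypothetical.

```python
import math

def posterior_A(prior, p_hat, r, n):
    """Posterior probability of theta = A after r 'a' signals in n draws,
    for an individual certain that Pr(a | A) = Pr(b | B) = p_hat."""
    # log of Pr(r | B) / Pr(r | A) = (2r - n) * log((1 - p_hat) / p_hat)
    log_lr = (2 * r - n) * math.log((1.0 - p_hat) / p_hat)
    log_odds_against = math.log((1.0 - prior) / prior) + log_lr
    # Numerically stable evaluation of 1 / (1 + exp(log_odds_against)).
    if log_odds_against > 0:
        e = math.exp(-log_odds_against)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(log_odds_against))

if __name__ == "__main__":
    # Two individuals with different priors and different (certain) p_hat.
    for n in (10, 100, 1000):
        r = int(0.7 * n)  # signal frequency above 1/2
        print(n, posterior_A(0.2, 0.6, r, n), posterior_A(0.9, 0.8, r, n))
```

When the frequency of "a" signals stays above 1/2, both posteriors approach 1 regardless of the priors or of the particular value of p_hat, and when it stays below 1/2 both approach 0.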

Proof  of  Lemma  3.    The  proof  is  identical  to  that  of  Lemma  1.    ■ 

Proof  of  Theorem  6. 

(Part 1) This part immediately follows from Lemma 3, as each $\pi^i_{k'} f^i_{A^{k'}}(\rho(s))$ is positive, and
$\pi^i_k f^i_{A^k}(\rho(s))$ is finite.

(Part 2) Assume $F^1_\theta = F^2_\theta$ for each $\theta \in \Theta$. Then, by Lemma 3, $\phi^1_{k,\infty}(\rho) - \phi^2_{k,\infty}(\rho) = 0$ if and
only if $\left(T_k(\pi^1) - T_k(\pi^2)\right)' T_k\left((f_\theta(\rho))_{\theta\in\Theta}\right) = 0$. The latter equality has probability 0 under both
probability measures $\Pr^1$ and $\Pr^2$ by hypothesis. ■

Proof of Theorem 7. Define $\bar{\pi} = (1/K, \ldots, 1/K)$. First, take $\pi^1 = \pi^2 = \bar{\pi}$. Then,

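$$\sum_{k' \neq k} \frac{\pi^1_{k'} f^1_{A^{k'}}(\rho(s))}{\pi^1_k f^1_{A^k}(\rho(s))} - \sum_{k' \neq k} \frac{\pi^2_{k'} f^2_{A^{k'}}(\rho(s))}{\pi^2_k f^2_{A^k}(\rho(s))} = \mathbf{1}' T_k\left((f^1_\theta(\rho(s)))_{\theta\in\Theta}\right) - \mathbf{1}' T_k\left((f^2_\theta(\rho(s)))_{\theta\in\Theta}\right) \neq 0,$$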

where $\mathbf{1} \equiv (1, \ldots, 1)$, and the inequality follows by the hypothesis of the theorem. Hence, by Lemma 3,
$|\phi^1_{k,\infty}(\rho(s)) - \phi^2_{k,\infty}(\rho(s))| > 0$ for each $\rho(s) \in [0,1]$. Since $[0,1]$ is compact and $|\phi^1_{k,\infty}(\rho(s)) - \phi^2_{k,\infty}(\rho(s))|$
is continuous in $\rho(s)$, there exists $\epsilon > 0$ such that $|\phi^1_{k,\infty}(\rho(s)) - \phi^2_{k,\infty}(\rho(s))| > \epsilon$ for each $\rho(s) \in [0,1]$.
Now, since $|\phi^1_{k,\infty}(\rho(s)) - \phi^2_{k,\infty}(\rho(s))|$ is continuous in $\pi^1$ and $\pi^2$, there exists a neighborhood $N(\bar{\pi})$
of $\bar{\pi}$ such that

$$|\phi^1_{k,\infty}(\rho(s)) - \phi^2_{k,\infty}(\rho(s))| > |\pi^1_k - \pi^2_k| \quad \text{for each } k = 1, \ldots, K \text{ and } s \in \bar{S}$$

for all $\pi^1, \pi^2 \in N(\bar{\pi})$. Since $\Pr^i(\bar{S}) = 1$, the last statement in the theorem follows. ■

Proof  of  Theorem  8.     Our  proof  utilizes  the  following  two  lemmas. 
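Lemma A.

$$\lim_{m\to\infty} \phi^i_{k,\infty,m}(p) = \frac{1}{1 + \sum_{k' \neq k} \dfrac{\pi^i_{k'}}{\pi^i_k}\, R\left(p - \hat{p}(i, A^{k'}),\, p - \hat{p}(i, A^k)\right)}.$$

Proof. By condition (i), $\lim_{m\to\infty} c(i, A^k, m) = 1$ for each $i$ and $k$. Hence, for every distinct $k$ and $k'$,

$$\lim_{m\to\infty} \frac{f^i_{A^{k'},m}(p)}{f^i_{A^k,m}(p)} = \lim_{m\to\infty} \frac{c(i, A^{k'}, m)}{c(i, A^k, m)} \, \lim_{m\to\infty} \frac{f\left(m(p - \hat{p}(i, A^{k'}))\right)}{f\left(m(p - \hat{p}(i, A^k))\right)} = R\left(p - \hat{p}(i, A^{k'}),\, p - \hat{p}(i, A^k)\right).$$

Then, Lemma A follows from Lemma 3. ■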
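
To see what Lemma A delivers under polynomial tails, the sketch below evaluates the limiting belief, read here as 1/(1 + sum over k' != k of (pi_k'/pi_k) R(p - p_hat(i, A^k'), p - p_hat(i, A^k))), in a hypothetical scalar example with two states. It assumes R(x, y) = (|x|/|y|)^(-alpha), the limit of f(mx)/f(my) for a density with tail index alpha that is positive at zero, so that R(x, 0) = 0 for x != 0. The function names, the value of alpha, and all numbers are illustrative and not taken from the paper.

```python
import math

def R_power(x, y, alpha=4.0):
    """Limit of f(m*x)/f(m*y) as m grows, assuming tails of f decay like |x|**(-alpha)
    and f(0) > 0 (hypothetical choice of f)."""
    if x == 0 and y == 0:
        return 1.0
    if y == 0:
        return 0.0          # f(m*x) -> 0 while f(0) stays positive
    if x == 0:
        return math.inf     # f(0) stays positive while f(m*y) -> 0
    return (abs(x) / abs(y)) ** (-alpha)

def limit_belief(k, p, prior, p_hat, R=R_power):
    """Limiting posterior weight on state k at signal frequency p.
    prior[k'] and p_hat[k'] are one individual's prior and believed frequency in state k'."""
    total = sum((prior[kp] / prior[k]) * R(p - p_hat[kp], p - p_hat[k])
                for kp in range(len(prior)) if kp != k)
    return 1.0 / (1.0 + total)

if __name__ == "__main__":
    prior = [0.5, 0.5]
    p_hat_i = [0.7, 0.3]   # individual i's believed frequencies in the two states
    p_hat_j = [0.6, 0.4]   # individual j's believed frequencies
    p = p_hat_i[0]         # frequency that i expects in the first state
    print(limit_belief(0, p, prior, p_hat_i), limit_belief(0, p, prior, p_hat_j))
```

In this example individual i becomes certain of the first state, while individual j's belief converges to 81/82, strictly below 1, so the two limiting beliefs differ even though each individual's uncertainty about the signal frequencies has vanished.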
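Lemma B. For any $\epsilon > 0$ and $h > 0$, there exists $\tilde{m}$ such that for each $m > \tilde{m}$, $k \le K$, and each
$\rho(s)$ with $\|\rho(s) - \hat{p}(i, A^k)\| < h/m$,

$$\left| \phi^j_{k,\infty,m}(\rho(s)) - \lim_{m\to\infty} \phi^j_{k,\infty,m}\left(\hat{p}(i, A^k)\right) \right| < \epsilon. \tag{25}$$

Proof. Since, by hypothesis, $R$ is continuous at each $\left(\hat{p}(i,\theta) - \hat{p}(j,\theta'),\, \hat{p}(i,\theta) - \hat{p}(j,\theta)\right)$, by
Lemma A, there exists $h' > 0$ such that

$$\left| \lim_{m\to\infty} \phi^j_{k,\infty,m}(\rho(s)) - \lim_{m\to\infty} \phi^j_{k,\infty,m}\left(\hat{p}(i, A^k)\right) \right| < \epsilon/2 \tag{26}$$

and by condition (iii), there exists $\tilde{m} > h/h'$ such that

$$\left| \phi^j_{k,\infty,m}(\rho(s)) - \lim_{m\to\infty} \phi^j_{k,\infty,m}(\rho(s)) \right| < \epsilon/2 \tag{27}$$

holds uniformly in $\|\rho(s) - \hat{p}(i, A^k)\| < h'$. The inequalities in (26) and (27) then imply (25). ■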

(Proof of Part 1) Since $R\left(\hat{p}(i, A^k) - \hat{p}(i, A^{k'}),\, 0\right) = 0$ for each $k' \neq k$ (by condition (i)), Lemma
A implies that $\lim_{m\to\infty} \phi^i_{k,\infty,m}\left(\hat{p}(i, A^k)\right) = 1$. Hence, $\lim_{m\to\infty} \left(\phi^i_{k,\infty,m}(\hat{p}(i, A^k)) - \phi^j_{k,\infty,m}(\hat{p}(i, A^k))\right) = 0$
if and only if $\lim_{m\to\infty} \phi^j_{k,\infty,m}\left(\hat{p}(i, A^k)\right) = 1$. Since each ratio $\pi^j_{k'}/\pi^j_k$ is positive, by Lemma A, the
latter holds if and only if $R\left(\hat{p}(i, A^k) - \hat{p}(j, A^{k'}),\, \hat{p}(i, A^k) - \hat{p}(j, A^k)\right) = 0$ for each $k' \neq k$, establishing
Part 1.

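(Proof of Part 2) Fix $\epsilon > 0$ and $\delta > 0$. Fix also any $i$ and $k$. Since each $\pi^i_{k'}/\pi^i_k$ is finite, by
Lemma 3, there exists $\epsilon' > 0$ such that $\phi^i_{k,\infty,m}(\rho(s)) > 1 - \epsilon$ whenever $f^i_{A^{k'},m}(\rho(s)) / f^i_{A^k,m}(\rho(s)) < \epsilon'$
holds for every $k' \neq k$. Now, by (i), there exists $h_{0,k} > 0$ such that

$$\Pr^i\left(\|\rho(s) - \hat{p}(i, A^k)\| < h_{0,k}/m \mid \theta = A^k\right) = \int_{\|x\| \le h_{0,k}} f(x)\,dx > (1 - \delta).$$

Let

$$Q_{k,m} = \left\{ p \in \Delta(L) : \|p - \hat{p}(i, A^k)\| < h_{0,k}/m \right\}$$

and $\kappa \equiv \min_{\|x\| \le h_{0,k}} f(x) > 0$. By (i), there exists $h_{1,k} > 0$ such that, whenever $\|x\| > h_{1,k}$, $f(x) < \epsilon'\kappa/2$.
There exists a sufficiently large constant $m_{1,k}$ such that for any $m > m_{1,k}$, $\rho(s) \in Q_{k,m}$, and
any $k' \neq k$, we have $\|\rho(s) - \hat{p}(i, A^{k'})\| > h_{1,k}/m$, and

$$\frac{f\left(m(\rho(s) - \hat{p}(i, A^{k'}))\right)}{f\left(m(\rho(s) - \hat{p}(i, A^k))\right)} < \frac{\epsilon'\kappa}{2}\,\frac{1}{\kappa} = \frac{\epsilon'}{2}.$$

Moreover, since $\lim_{m\to\infty} c(i, \theta, m) = 1$ for each $i$ and $\theta$, there exists $m_{2,k} > m_{1,k}$ such that
$c(i, A^{k'}, m)/c(i, A^k, m) < 2$ for every $k' \neq k$ and $m > m_{2,k}$. This implies

$$f^i_{A^{k'},m}(\rho(s)) / f^i_{A^k,m}(\rho(s)) < \epsilon',$$

establishing that

$$\phi^i_{k,\infty,m}(\rho(s)) > 1 - \epsilon. \tag{28}$$

Now, for $j \neq i$, assume that $R\left(\hat{p}(i,\theta) - \hat{p}(j,\theta'),\, \hat{p}(i,\theta) - \hat{p}(j,\theta)\right) = 0$ for each distinct $\theta$ and $\theta'$.
Then, by Lemma A, $\lim_{m\to\infty} \phi^j_{k,\infty,m}\left(\hat{p}(i, A^k)\right) = 1$, and hence by Lemma B, there exists $m_{3,k} > m_{2,k}$
such that for each $m > m_{3,k}$, $\rho(s) \in Q_{k,m}$,

$$\phi^j_{k,\infty,m}(\rho(s)) > 1 - \epsilon. \tag{29}$$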
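Notice that when (28) and (29) hold, we have $\|\phi^i_{\infty,m}(s) - \phi^j_{\infty,m}(s)\| < \epsilon$. Then, setting $\tilde{m} = \max_k m_{3,k}$,
we obtain the desired inequality for each $m > \tilde{m}$:

$$\begin{aligned}
\Pr^i\left(\|\phi^i_{\infty,m}(s) - \phi^j_{\infty,m}(s)\| < \epsilon\right) &= \sum_{k \le K} \Pr^i\left(\|\phi^i_{\infty,m}(s) - \phi^j_{\infty,m}(s)\| < \epsilon \mid \theta = A^k\right) \Pr^i\left(\theta = A^k\right) \\
&\ge \sum_{k \le K} \Pr^i\left(\rho(s) \in Q_{k,m} \mid \theta = A^k\right) \pi^i_k \\
&> \sum_{k \le K} (1 - \delta)\, \pi^i_k \\
&= 1 - \delta.
\end{aligned}$$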


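(Proof of Part 3) Assume that $R\left(\hat{p}(i,\theta) - \hat{p}(j,\theta'),\, \hat{p}(i,\theta) - \hat{p}(j,\theta)\right) \neq 0$ for each distinct $\theta$ and
$\theta'$. Then, since each $\pi^j_{k'}/\pi^j_k$ is positive, Lemma A implies that $\lim_{m\to\infty} \phi^j_{k,\infty,m}\left(\hat{p}(i, A^k)\right) < 1$ for each
$k$. Let

$$\epsilon = \min_k \left\{ 1 - \lim_{m\to\infty} \phi^j_{k,\infty,m}\left(\hat{p}(i, A^k)\right) \right\} / 3 > 0.$$

Then, by part 2, for each $k$, there exists $m_{2,k}$ such that for every $m > m_{2,k}$ and $\rho(s) \in Q_{k,m}$, we have
$\phi^i_{k,\infty,m}(\rho(s)) > 1 - \epsilon$. By Lemma B, there also exists $m_{3,k} > m_{2,k}$ such that for every $m > m_{3,k}$ and
$\rho(s) \in Q_{k,m}$,

$$\phi^j_{k,\infty,m}(\rho(s)) < \lim_{m\to\infty} \phi^j_{k,\infty,m}\left(\hat{p}(i, A^k)\right) + \epsilon \le 1 - 2\epsilon < \phi^i_{k,\infty,m}(\rho(s)) - \epsilon.$$

This implies that $\|\phi^i_{\infty,m}(\rho(s)) - \phi^j_{\infty,m}(\rho(s))\| > \epsilon$. Setting $\tilde{m} = \max_k m_{3,k}$ and changing
$\|\phi^i_{\infty,m}(s) - \phi^j_{\infty,m}(s)\| < \epsilon$ at the end of the proof of Part 2 to $\|\phi^i_{\infty,m}(s) - \phi^j_{\infty,m}(s)\| > \epsilon$, we obtain
the desired inequality. ■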